This report describes a pilot study on the development and administration
of a test using a spatial reasoning problem, the 15-puzzle. The test utilized
the on-line capabilities of a real-time computer (1) to record an examinee's
progress on each problem through a sequence of problem-solving "moves" and (2)
to collect additional on-line data that might be of relevance to the
evaluation of examinee performance (e.g., number of illegal and repeated
moves, response latency trends). The examinees, 61 students in an
introductory psychology class, were required to type a sequence of moves that
would bring one 4 x 4 array of scrambled numbers (start configuration) into
agreement with a second 4 x 4 array (goal configuration), using as few moves
as possible. Data analyses emphasized the comparison of several methods of
indexing problem difficulty, methods of scoring individual performance, and
the relationship between response latency data, performance, and
problem-solving strategy.

Subjective ratings of the perceived difficulty of replications of the
15-puzzle were obtained from a separate student sample to investigate (1) the
subjective dimensions used by students in evaluating the difficulty of this
problem type, (2) how accurately the actual performance difficulty of these
problems could be evaluated by students, and (3) whether there were reliable
individual differences in difficulty perceptions related to actual
performance differences.

Results of the study suggested that four performance indices might be
useful in indexing problem difficulty: (1) mean number of moves in the sample,
(2) proportion of students solving the problem, (3) proportion of students
solving the problem in the optimal number of moves, and (4) a Special
Difficulty Index, defined as the sample mean number of moves divided by the
minimum number of moves required. Four alternative methods of scoring total
test performance and two methods of scoring individual problem performance
were studied. The scores that took into account differential numbers of moves
between the optimal and maximum number allowed were related somewhat more to
performance ratings obtained from independent judges.

Examination of problem performance indices, the Special Difficulty Index,
and students' perceptions of the difficulty of the test problems indicated
that most of the problems were too easy for most students. However, the
possibility of obtaining a more discriminating subset of problems was
suggested by item-total score correlations obtained for each problem. The
data suggested that better consistency might be obtained using problems of
similar difficulty levels, and it was hypothesized that an adaptive test
tailoring problems to the ability level of each student would increase the
reliability of measurement.

Mean initial and total "move" latencies for each problem were strongly
related to some of the performance indices of problem difficulty. At the
level of individual performance, only total latency or problem solution time
was related to problem performance. Latency data appeared to confound
differences in the ability to visualize a sequence of moves and differences
in students' work styles. Strong evidence for these work styles was found in
student consistency of initial, average, and total response latency measures
across all problems.

Perceived difficulty ratings showed reliable individual differences in the
level and variability of difficulty perceptions. The data suggested that the
individual differences found were related to individual differences in
ability to visualize and to maintain a sequence of moves in short-term
memory. It was concluded that an adequate selection of problem replications
should be able to tap these differences, resulting in reliable solution
performance differences.

Improvements in problem selection and design were suggested by the data in
this study. Future tests of this type should consist of fewer but more
difficult problems, particularly problems not permitting reactive, impulsive
solutions. This type of test would seem especially appropriate for adaptive
administration: (1) scores on problems tailored to the individual's ability
would likely be more highly related to each other, resulting in more highly
reliable total scores; (2) the motivational aspects of the tests, which seem
more taxing and potentially frustrating than conventional item formats, would
likely be improved; and (3) for most testees, equally precise measurements
could be obtained in shorter periods of time than with conventional test
administration.
Interactive Computer Administration
of a Spatial Reasoning Test



Most research on computer-administered testing has emphasized the
ability of the computer to adapt item difficulties to the ability level
of examinees. Such computerized adaptive tests have been shown to provide
more equiprecise measurement across all trait levels (e.g., Vale, 1975;
Vale & Weiss, 1975), to provide generally higher test-retest stabilities
than conventional tests (e.g., Betz & Weiss, 1973, 1975), and to result
in tests of fewer items while achieving the same or higher levels of
measurement accuracy (Weiss & Betz, 1973). In addition, research has
indicated that immediate knowledge of results administered to testees
after each item in computer-administered tests results in enhanced
performance (Betz & Weiss, 1976a) and favorable psychological effects for
examinees (Betz & Weiss, 1976b). Research with computer administration of
a concept attainment task (Johnson & Baker, 1972) indicated that improved
standardization could also be obtained with computer test administration;
and the results of Johnson and Mihal (1973) and Pine, Church, Gialluca,
and Weiss (1979) indicated that differences in mean performance of racial
groups might be reduced or eliminated with computer-administered testing.

Most of this computer-administered testing research has measured
intellectual abilities and utilized item types that are conveniently
measured by conventional paper-and-pencil tests as well. However,
computers would seem to be especially useful in measuring various
perceptual, memory, and problem-solving abilities that utilize the
computer's capabilities to present novel item formats, modifying item
presentation over time in response to the examinee's performance and
allowing the computer to interact with the student while working on a
task. It is of interest to determine whether the advantages previously
found for computer-administered tests, particularly in an adaptive mode,
can be extended to tests of new abilities that make fuller use of the
unique capabilities of the interactive computer.

Although the use of computers to control the presentation of visual
stimuli on a cathode-ray-tube (CRT) is fairly common in psychological
research, most of this research has been concerned with the discovery of
processes of attention, memory, and perception that apply to all
individuals. Recently, however, investigators have begun to explore the
potential of computer-administered tests for measuring individual
differences in various cognitive abilities. For example, Cory (1977;
Cory, Rimland, & Bryson, 1977) has developed tests for five abilities
(short-term memory, perceptual speed, perceptual closure, movement
detection, and dealing with concepts/information) and compared scores on
these tests to conventional paper-and-pencil tests of comparable
abilities. The conclusion was that these tests provided measures of
attributes that are different from those measured by paper-and-pencil
tests. For example, a "sequential reasoning" dimension, which did not
appear in the paper-and-pencil tests, was identified in the computerized
tests. Computer test administration is also being increasingly used by
psychologists interested in measuring individual differences in various
basic information processing abilities (e.g., Chiang & Atkinson, 1976;
Hunt, Lunneborg, & Lewis, 1975; Rose, 197?).

A common characteristic of such new ability tests is that traditional
psychometric indices of individual performance (such as number-correct
scores) and item characteristics (such as item difficulty and item
discrimination) may no longer be meaningful. To measure individual
differences in examinee performance, researchers have used scores derived
from reaction time data; slope and intercept parameters relating reaction
time to memory set size (Sternberg, 19??); component scores on various
stages or subprocesses derived from hypothesized models (e.g., Clark &
Chase, 1972); and parameter scores (d', beta) derived from signal
detection theory. Some, but not all, researchers using such measures of
individual differences have attempted to demonstrate the psychometric
characteristics (e.g., reliability) of these new performance indices.
Such a demonstration is necessary, however, for each new score derived
from new types of ability tests before the validity and utility of the
scores can be investigated.

Purpose

This report describes a pilot study reporting the development and
administration of a spatial reasoning problem, the 15-puzzle, which
utilized the on-line capabilities of a real-time computer to record a
testee's progress on each problem throughout a sequence of "moves" and to
collect additional on-line data that might be of relevance to the
evaluation of testee performance. Although spatial ability has been shown
to be an important special ability predictive of some job criteria (for a
summary of predictive validities for various occupational areas between
1920 and 1971, see Ghiselli, 1973), it was also hoped that this problem
type and others to be developed would be able to tap generalized
problem-solving and reasoning abilities.

The 15-puzzle problem used in this study involved presentation of the
numbers 1 to 15 in a 4 x 4 matrix of scrambled numbers and in a target
matrix with the numbers in another configuration. The testee was required
to move the numbers in the first configuration, one number at a time, to
match the second configuration. This problem type was chosen because it
seemed to tap abilities important in problem-solving situations,
especially in the spatial domain, while providing the following
additional advantages:

1. Utilization of the unique capabilities of interactive computers.

2. The existence of a well-defined optimal solution against which
to evaluate a student's performance.

3. The ease of generating large numbers of replications of varying
and relatively controllable difficulty levels.

If the advantages of computerized adaptive testing are to be applied to
tests of this type, precise indices of individual performance and problem
difficulty must be devised. Thus, an important emphasis in this study was
on a comparison of alternative methods for quantifying student
performance and a comparison of alternative indices of problem difficulty
for the 15-puzzle spatial reasoning problem. For example, the number of
moves a student requires to solve replications of the 15-puzzle may not
be an adequate index of problem performance if the minimum number of
moves for various problems differs. Some of the questions studied were:
Is the minimum number of moves to solution a meaningful index of problem
difficulty, or do other physical aspects of the puzzle configuration
influence problem difficulty as well? Can response latencies be used to
quantify difficulty and/or individual performance? In addition, to
determine whether the 15-puzzle task could be used to successfully
measure problem solving in the spatial domain, the reliability of
individual performance scores across problems of similar and varying
difficulty levels was examined.

One further advantage of the problem type studied here may be its
interactive game format, which may prove to be more motivating to
examinees than the usual separate item format. In addition, the provision
of knowledge of results may be a built-in feature of these problems,
since the students can tell when they have reached a solution. On the
other hand, the need for perseverance and the possibly greater potential
for frustration and anxiety with this type of problem must also be
considered. Thus, motivational data were collected and examined in this
study in an attempt to draw some preliminary conclusions about the
psychological effects of working on such problems.

To a large degree, the psychological effects of problems of this type on
examinees will depend on the perceived difficulty of replications of the
problems. It would seem that problems of this type that are inappropriate
for the student's ability would be more discouraging than conventional
test items because the student cannot merely guess and continue with the
next item. In problems of this type, guessing becomes not a response bias
to be eliminated but a trial-and-error strategy on the part of the
examinee. Thus, eventual adaptation of problems to the student's ability
level may be especially important for making the testing experience
reasonably pleasant and nonfrustrating.

However, whether an adaptive presentation of problems can actually
equalize the psychological effects of such a test will depend largely on
whether students can accurately perceive the difficulties of the items
administered (Prestwood & Weiss, 1977). Even though some previous
research has found agreement between perceived and objective indices of
item difficulty (e.g., Bratfisch, Dornic, & Borg, 1972; Munz & Jacobs,
1971; Prestwood & Weiss, 1977), it would seem necessary to answer this
question anew when item or problem types differ significantly. The
present study, therefore, reports some preliminary data relating to the
similarity of objective and perceived indices of problem difficulty for
replications of the 15-puzzle.
Computer-Administered Problems

Test

Problem description. A series of 15-puzzles, each a spatial reasoning
problem, were administered to students on an interactive cathode-ray-tube
(CRT) display terminal. The sequence of problem presentations and the
simultaneous collection of performance data were controlled by a computer
program written for a Hewlett-Packard real-time minicomputer.

Figure 1 shows a sample of the display presented on the CRT screen while
the student worked on each problem. As Figure 1 shows, the student was
instructed to type a three-character "move" on the terminal keyboard
specifying which number in the left pattern he or she wished to move
left, right, up, or down one square, in an attempt to eventually bring
the configuration of numbers in the left pattern into agreement with the
pattern of numbers on the right.

Figure 1

Sample 15-Puzzle Problem

Make your "moves" in this pattern          Try to match this pattern

[Two 4 x 4 patterns of the numbers 1 to 15, each with one empty square:
the start pattern on the left and the goal pattern on the right. The
individual cell values are not reliably legible in this copy.]

Enter your move by typing three characters and the "RETURN" key.

The first two characters should be the number you want to move.
If the number has only one digit, type one space and then the one
digit number.

The third character should be:

L - if you want to move the number one square to the left.

R - if you want to move the number one square to the right.

U - if you want to move the number up one square.

D - if you want to move the number down one square.

After each three-character move was typed, the computer processed the
move for legality. If the move was legal, the pattern on the left was
updated immediately using a cursor addressing system, which allowed
specified screen locations to be manipulated without rewriting the entire
screen. If the three-character move was illegal, an explanatory error
message was displayed, and in some cases the student was instructed to
notify the test proctor for assistance. The testing program detected
illegal moves of both a syntactical (e.g., typing errors) and a logical
(e.g., trying to move a number into an already occupied square or beyond
the outer edge of the pattern) nature. Appendix A contains a complete
list of diagnostic error messages utilized by the testing program.
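
The legality check described above can be illustrated with a minimal
sketch. It is not the original minicomputer program: the function name
`apply_move`, the board representation (a 4 x 4 list of lists with 0 for
the empty square), and the wording of the error strings are all
assumptions made for illustration. The actual messages are those listed
in Appendix A.

```python
# Hypothetical sketch of the move check; names and messages are illustrative.
DIRECTIONS = {"L": (0, -1), "R": (0, 1), "U": (-1, 0), "D": (1, 0)}


def apply_move(board, move):
    """Return (new_board, error). board is a 4x4 list of lists, 0 = empty square.

    On an illegal move the original board is returned with an error string;
    on a legal move a new board is returned with error set to None.
    """
    # Syntactical checks: three characters, a number from 1 to 15, a direction.
    if len(move) != 3:
        return board, "syntax: a move must be exactly three characters"
    number_text, letter = move[:2].strip(), move[2].upper()
    if not number_text.isdecimal() or not 1 <= int(number_text) <= 15:
        return board, "syntax: first two characters must be a number from 1 to 15"
    if letter not in DIRECTIONS:
        return board, "syntax: third character must be L, R, U, or D"

    number = int(number_text)
    row, col = next((r, c) for r in range(4) for c in range(4)
                    if board[r][c] == number)
    d_row, d_col = DIRECTIONS[letter]
    new_row, new_col = row + d_row, col + d_col

    # Logical checks: stay inside the pattern and move only into the empty square.
    if not (0 <= new_row < 4 and 0 <= new_col < 4):
        return board, "logic: that move goes beyond the edge of the pattern"
    if board[new_row][new_col] != 0:
        return board, "logic: that square is already occupied"

    new_board = [list(r) for r in board]
    new_board[new_row][new_col] = number
    new_board[row][col] = 0
    return new_board, None


example = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 0]]
updated, error = apply_move(example, "12D")  # slide 12 down into the empty square
```

Returning a new board instead of changing the old one keeps every earlier
configuration available, which is convenient if the sequence of moves is
to be stored and analyzed afterward.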

Performance data. While the student worked on the problem, the following
data were collected on-line by the computer:

1. Whether the problem was solved or not, i.e., whether the student
was able to type a sequence of moves that would make the
configuration on the left match the configuration on the right.

2. The number of moves required for solution.

3. The number of illegal moves, including impossible ones of both a
syntactical and a configural nature.

4. The number of repeated moves, i.e., how many times the student
"backed up," or reversed a possibly incorrect sequence of moves,
to return to an earlier pattern configuration.

5. Response latencies, i.e., the time in seconds required for each
move.

6. The actual sequence of moves utilized.
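
One way to operationalize the repeated-moves count in item 4 is sketched
below. The report does not give the exact counting rule, so this is only
one possible reading: a move is counted as repeated when it returns the
left pattern to a configuration that has already appeared during the
problem. The function name and the input format are hypothetical.

```python
def count_repeated_moves(configurations):
    """Count moves that return the pattern to an earlier configuration.

    configurations: sequence of hashable board states (e.g., tuples of 16
    numbers), beginning with the start pattern and followed by the pattern
    after each legal move. Assumed counting rule, not the report's own.
    """
    seen = set()
    repeats = 0
    for state in configurations:
        if state in seen:
            repeats += 1
        seen.add(state)
    return repeats
```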

The performance data were collected for possible use in drawing
inferences about several aspects of spatial problem-solving ability. For
example, the number of illegal moves, as well as the initial response
latencies, might index the student's initial ability to define and to
clarify the task situation. The sensitivity of students to the task
information provided (in this case, the continually updated left pattern
and its relationship to the right pattern) and their ability to plan a
sequence of moves might be indexed by the number of nonoptimal moves, the
number of repeated moves, and the total number of moves required. A
student's inability to recenter (Sweeny, 1955; Wertheimer, 1959) or the
presence of a debilitating set might be inferred from a persistent
sequence of moves that did not bring the start pattern closer to the goal
pattern.

The pattern of response latencies as the student approached the solution
might also be useful information in making inferences about a student's
problem-solving strategy. For example, in the initial stages of the
problem, a planning-ahead strategy might be inferred from longer initial
response latencies, and a more impulsive, reactive strategy or
problem-solving style would be associated with shorter latencies. If the
student was sensitive to the relationship between the two stimulus
patterns, a shortening of the response latencies might be expected as the
left (start) pattern approached the right (goal) pattern (Hayes, 1965).

Individual differences in the ability to visualize or to maintain
sequences of moves of varying lengths in short-term memory might also be
reflected in the patterns of response latencies. For example, an
individual with a greater ability to maintain a sequence of moves in
short-term memory might need longer pauses or study points only once
every six or seven moves as opposed to every three or four moves.
Isolation and interpretation of such differences may be difficult,
however, since momentary differences in short-term memory capacity may
also reflect differences in the allocation of limited cognitive resources
(Norman, 1976).

Test Administration

Sixty-one students in an introductory psychology class took the
problem-solving test. Of these, tests for five students had to be
discarded because of computer problems. After being logged onto the CRT
by a test monitor, the student was presented a series of instructional
screens by the computer. The text of each instruction screen is in
Appendix B. The presentation of instruction screens was student paced,
with the student pressing the "SPACE BAR" and "RETURN" key on the
terminal keyboard to proceed to the next instruction screen.



As Appendix B shows, the instructions first told the student how to
utilize important keyboard characters, such as the RETURN key, to enter
responses. Next, after describing the 15-puzzle task and providing
instructions on entering a three-character move, the instructions told
the student how to correct a mistyped move before transmitting the move
to the computer. Biographical information, including name, student
identification number, age, sex, year in school, major field of study,
race, and grade-point average, was then requested from each student. The
final instructional screen (see Screen 16 in Appendix B) was intended to
standardize the desired motivational set for each student. The student
was then presented with a practice problem. This practice problem
(Problem 1) was very simple, requiring only three straightforward moves,
and was used to allow students to clarify questions and to gain
confidence in entering moves under nontesting conditions.

Following the practice problem, students were presented a maximum of 12
problems (Problems 2 to 13). These problems varied in difficulty, which
was initially indexed by the minimum number of moves required for
solution (solution path length) using a solution algorithm provided by
Nilsson (1971). The 12 problems consisted of one problem each requiring 4
and 6 moves and two problems for each of the following solution path
lengths: 8, 10, 12, 14, and 16. The 13 problems used, along with their
solution path length and other physical problem characteristics, are in
Appendix C.
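
For readers who want to see how a solution path length can be obtained,
the following is a minimal sketch of a shortest-path search. It uses A*
search with a city-block heuristic, which is a standard technique chosen
here for illustration; it is not claimed to be the specific algorithm
taken from Nilsson (1971). Patterns are assumed to be tuples of 16
numbers in row order with 0 for the empty square, and `max_depth` is a
hypothetical safeguard that bounds the search.

```python
import heapq


def neighbors(state):
    """Yield every pattern reachable by sliding one number into the empty square."""
    blank = state.index(0)
    row, col = divmod(blank, 4)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 4 and 0 <= c < 4:
            other = r * 4 + c
            nxt = list(state)
            nxt[blank], nxt[other] = nxt[other], nxt[blank]
            yield tuple(nxt)


def city_block(state, goal_index):
    """Sum of horizontal plus vertical distances of each number from its goal square."""
    total = 0
    for index, number in enumerate(state):
        if number:  # the empty square is not counted
            g = goal_index[number]
            total += abs(index // 4 - g // 4) + abs(index % 4 - g % 4)
    return total


def solution_path_length(start, goal, max_depth=20):
    """Return the minimum number of moves from start to goal.

    Returns None if no solution of at most max_depth moves exists.
    """
    goal_index = {number: i for i, number in enumerate(goal)}
    frontier = [(city_block(start, goal_index), 0, start)]
    best = {start: 0}
    while frontier:
        _, moves, state = heapq.heappop(frontier)
        if state == goal:
            return moves
        if moves > best[state] or moves >= max_depth:
            continue  # stale queue entry, or depth bound reached
        for nxt in neighbors(state):
            if moves + 1 < best.get(nxt, max_depth + 1):
                best[nxt] = moves + 1
                estimate = moves + 1 + city_block(nxt, goal_index)
                heapq.heappush(frontier, (estimate, moves + 1, nxt))
    return None
```

Because the city-block distance never overestimates the remaining number
of moves, the first time the goal pattern is taken from the queue its move
count is minimal. Half of all arrangements cannot be reached from a given
goal at all, so a depth bound is a practical precaution.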

Data for all students were not obtained for all the problems for a
variety of reasons. Since the students differed in both solution
efficiency and in the amount of time they had available to participate in
the study, not all students completed all the problems. In addition,
after about half the students (the first 33) had completed the tests, it
appeared that a test consisting of 12 problems was somewhat too long and
that some students did not have enough time in the experimental hour to
finish the longer problems. For this reason, two of the easiest problems
(Problems 2 and 3), which everyone seemed to be solving in the minimum
number of moves, were eliminated to make the test shorter. Finally, in a
few cases, data for a single problem were lost for a student due to
computer problems.

There was no fixed time limit for each problem. However, in order to
prevent a student from spending too much time on a single problem to the
exclusion of others, a message advising the student to notify the test
proctor was displayed on the terminal screen after the student had been
working on a problem for what was thought to be an unduly long time. The
maximum time allowed for each problem was a multiplicative function of
the minimum number of moves required, up to a maximum of 15 minutes. For
example, about 4 minutes were allowed for a problem requiring 3 moves,
about 10 minutes for a problem requiring 8 moves, and about 15 minutes
for problems requiring 12 to 16 moves. The proctor then had the option of
advancing the student to the next problem or resetting the problem timer
to allow the student to continue work on that problem. Students were
encouraged to discontinue work on a problem unless they felt confident
they were near solution and needed only a little more time.
Similarly, the student was stopped when he or she had taken the maximum
number of moves allowable for a problem. The maximum number of moves
allowed by the computer was also a function of the minimum number of
moves (solution path length) required to solve the problem. The maximum
number of moves was defined as the solution path length times 3.5; if
the maximum number of moves was greater than 28, the maximum move limit
was set equal to 28. This maximum was intended to terminate work on a
problem the student appeared unable to solve so that he/she would proceed
to subsequent problems. The number of moves it would take to recover from
nonoptimal moves was taken into consideration in specifying this initial
maximum move limit. It was realized, however, that this maximum limit
might have to be adjusted once actual performance data were obtained.
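
The move-limit rule just described amounts to a single capped product. A
minimal sketch follows; the function name is hypothetical, and rounding a
fractional limit down (for an odd path length) is an assumption, since
the report does not say how fractions were handled. The `cap` parameter
is included only so that a different ceiling can be tried.

```python
import math


def max_moves_allowed(path_length, multiplier=3.5, cap=28):
    """Move limit: solution path length times 3.5, but never more than the cap.

    Rounding down is an assumed convention for fractional products.
    """
    return min(math.floor(path_length * multiplier), cap)
```

Under this rule a 4-move problem allows 14 moves and a 6-move problem
allows 21, while every problem with a path length of 8 or more reaches
the ceiling of 28.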

The maximum number of moves allowed was increased for about half the
students to determine if students could reach solution if they were given
more moves. Thirty-three students were limited to 28 moves for the
longest problems and the remainder were allowed 43 moves. The larger move
limit seemed to allow more students to reach solutions for some of the
longer problems.

A student was permitted to voluntarily terminate a problem before the
solution was reached by asking the test proctor to advance him/her to the
next problem. In the few cases in which this situation arose, students
were encouraged to continue work on a problem unless the time limit
message had already appeared.

When the student successfully completed a problem by matching the start
and goal pattern, the computer displayed the message:

Good. You have succeeded in matching the two patterns. Press the
"space" bar and "RETURN" to start the next problem.

Test Reaction Data

Upon completing all the test problems, a message thanked the student for
his/her participation. Students then completed a paper-and-pencil
questionnaire providing information on prior experience, difficulty
perceptions, and other motivational questions that could be used to
evaluate student reactions to this type of test.

Since a general measure of spatial reasoning ability was sought,
individual differences in test performance should not be accounted for by
specific prior experience with this type of puzzle. Therefore, the first
question asked the student how often he/she had worked on this kind of
puzzle in the past. In order to evaluate the clarity of the instructions
for this new type of test item, the second question asked students how
much difficulty they had in understanding the instructions. Because this
was the first time this problem type had been used on this student
population, it was not known before data collection how difficult puzzle
replications would have to be to challenge the students. Thus, the third
question obtained information on how difficult the students thought the
puzzles in the test were.

It was felt that the student's motivation level during testing would be
especially important for performance on problems of this type, which
require more concentration and within-problem perseverance than more
typical single item formats. Consequently, Question 4 asked students how
hard they tried to solve each puzzle in the optimal number of moves, and
Question 5 asked whether the length of the test affected their
motivation. Students indicated how nervous or uncomfortable they were
while working on the puzzles in Question 6. Overall evaluations of how
well they thought they had performed and how well they enjoyed working on
the puzzles were provided by students in Questions 7 and 8, respectively.
Any further comments the students had were elicited by Question 9. Since
all Puzzle Reaction questions referred to different content, no scores
were derived across items.

Data Analysis

Indices of problem difficulty. Data collected for each problem were used
to describe problem difficulty in several ways. For each of the 13
problems (12 problems plus 1 practice problem), the frequency and
proportion of students requiring various numbers of moves to solve or to
fail to solve the problem was calculated. The following were also
computed for each of the 13 problems as potential indices of problem
difficulty:

1. The mean number of moves taken. This was the average number of
legal moves used by the student to solve the problem or the
number of moves at which the problem was terminated due to
using too many moves or too much time. Since the move limit
was extended from 28 to 43 for about one-third of the
students, the mean number of moves was slightly lower for the
longer puzzles than it would have been had all students been
allowed the larger maximum number of moves.

2. The proportion of students solving the problem within the
original maximum number of moves (i.e., for the longer
puzzles, 28 moves).

3. The proportion of students solving the problem in the minimum
or optimal number of moves.

4. The mean number of illegal moves.

5. The mean number of repeated moves.

In addition, for each problem a Special Difficulty Index was computed,
defined as the mean number of moves used divided by the minimum number of
moves required (solution path length). This index was designed to provide
a possible difficulty index that was corrected for differences in minimum
solution path lengths for each problem. For example, a problem requiring
16 moves may not be more difficult (in the sense that nearly everyone
could solve it in the minimum number of moves) than a problem requiring
only 10 moves.
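
The move-based indices and the Special Difficulty Index reduce to a few
ratios per problem. The sketch below illustrates the arithmetic only; the
function name, the argument layout, and the example numbers are
hypothetical and are not data from this study.

```python
def difficulty_indices(moves_taken, solved, path_length):
    """Compute per-problem difficulty indices from one entry per student.

    moves_taken: legal moves used (or the count at which work was stopped).
    solved: True/False for each student, in the same order.
    path_length: minimum number of moves required for the problem.
    """
    n = len(moves_taken)
    mean_moves = sum(moves_taken) / n
    optimal = sum(1 for m, s in zip(moves_taken, solved) if s and m == path_length)
    return {
        "mean_moves": mean_moves,
        "proportion_solved": sum(1 for s in solved if s) / n,
        "proportion_optimal": optimal / n,
        # Special Difficulty Index: mean moves divided by minimum moves.
        "special_difficulty_index": mean_moves / path_length,
    }


# Made-up example: four students on a problem with a 10-move optimal solution.
example = difficulty_indices([10, 10, 14, 28], [True, True, True, False], 10)
```

An index of 1.0 would mean that every student used exactly the minimum
number of moves; larger values indicate more wasted moves relative to the
length of the optimal solution.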

A possible advantage of the relatively formal nature of the 15-puzzle is
the availability of potentially objective physical problem
characteristics, which could function as potential indices of task
difficulty. One such index, solution path length (i.e., the minimum
number of moves required for solution), has already been mentioned.
Several other indices relating the start pattern to the goal pattern were
computed to determine if they related empirically to the actual
difficulty in solving each problem as indexed by student performance. If
such a relationship was found, these physical indices could be used in
selecting problem replications for inclusion in a test on the basis of
their predicted difficulty.

The following physical problem characteristics of each pair of
patterns were considered as potential difficulty indices:

1. Path length: the minimum number of moves required to solve the
   problem.

2. The number of squares not matching in the start and goal
   patterns at the start of the problem (maximum = 15).

3. The number of rows disrupted or not matching in the two
   patterns (maximum = 4).

4. The number of columns disrupted or not matching in the two
   patterns (maximum = 4).

5. Euclidean distance function: the sum of the distances of each
   number's position in the start pattern from its position in
   the goal pattern, using the Pythagorean theorem (i.e., diagonal
   distances allowable).

6. City-block distance function: the sum of the distances of each
   number's position in the start pattern from its position in
   the goal pattern with only vertical and horizontal (not
   diagonal) displacements calculated.

Appendix C shows each of these physical problem characteristics for
each of the 13 problems.
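The indices numbered 2 through 6 above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the study: the function name, the list-of-rows layout, the use of 0 for the empty square, and the decision to let the empty square count when comparing rows and columns are all assumptions. Path length (index 1) needs a search over move sequences and is omitted.

```python
import math

def physical_indices(start, goal):
    """Indices 2-6 for two 4 x 4 patterns given as lists of rows (0 = blank)."""
    def positions(pattern):
        # Map each value to its (row, column) position.
        return {v: (r, c) for r, row in enumerate(pattern) for c, v in enumerate(row)}

    def column(pattern, c):
        return [row[c] for row in pattern]

    ps, pg = positions(start), positions(goal)
    tiles = [v for v in ps if v != 0]  # the 15 numbered squares

    return {
        "mismatched_squares": sum(ps[v] != pg[v] for v in tiles),
        # Assumption: a row or column counts as disrupted if any cell differs,
        # including the cell holding the blank.
        "rows_disrupted": sum(start[r] != goal[r] for r in range(4)),
        "columns_disrupted": sum(column(start, c) != column(goal, c) for c in range(4)),
        "euclidean": sum(math.dist(ps[v], pg[v]) for v in tiles),
        "city_block": sum(abs(ps[v][0] - pg[v][0]) + abs(ps[v][1] - pg[v][1])
                          for v in tiles),
    }

# Hypothetical pair of patterns: the start is two moves away from the goal.
goal = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 0]]
start = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 0, 11], [13, 14, 15, 12]]
print(physical_indices(start, goal))
```

In this example squares 11 and 12 are each one position from their goal locations, so the two distance functions agree; they diverge only when a square is displaced both vertically and horizontally.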

Assessment of student performance. Deriving scores for a student
on a single problem, and on this type of test as a whole, is complicat-
ed by several factors. For example, some students were not able to
work on the test as long as others; some students naturally worked
faster than others; and in a few cases, data on isolated problems were
lost because of computer failure. In addition, half the students did
not work on Problems 2 and 3, since these were eliminated to shorten
the test.

As a result, scoring a student's performance merely by the number
of problems solved was not only undesirable from a theoretical point of
view, but it was also impractical due to the above confounding factors.
For this reason, and also from the point of view of using these prob-
lems in future adaptive testing, it was desirable to develop scoring
methods that did not depend on the particular problem replications on
which the student worked. This suggested using such measures as the
proportion of problems worked on by the student that he or she was able
to solve or the proportion of problems attempted that the student
solved in the optimal number of moves. However, these measures do not
take into account the differential difficulty of different problems or
individual differences in the number of moves used between the optimal
and maximum allowed number. Using the number of moves a student made
on a problem would not take into account the differential solution path
lengths and the difficulty of problems. Potential measures that would
take into account the difficulty of various problems, such as the mean
difficulty of problems solved or the highest difficulty problem solved
in the optimal number of moves, would not be comparable for students
who did not receive problems of the same difficulty level.

Taking into consideration all these problems, two methods of scor-
ing student performance on individual problems were devised:

1. Score 1 = (the number of moves the student used) /
             (the minimum number of moves actually required).

   For example, if a student took 15 moves to solve Problem 6,
   which required 10 moves, his/her score was 15/10 = 1.5. Since
   a perfect score would be 1.0, this student required 50% more
   moves than were necessary. Note that although this score cor-
   rected for different solution path lengths of various prob-
   lems, it did not take into account the difficulty of the prob-
   lem as indexed by the total group's performance on the prob-
   lem.

2. Score 2. This score was Score 1 adjusted by the Special Dif-
   ficulty Index. Thus,

   Score 2 = (Score 1)/(Special Difficulty Index).

   This score reduces to

   Score 2 = (the number of moves the student used) /
             (the mean number of moves required by the total group).

   Thus, if Score 2 = 1.0, the student's performance was equal to
   the group average. If Score 2 was less than 1.0, the student
   solved the problem in fewer moves than the average student;
   conversely, if Score 2 was greater than 1.0, the student
   solved the problem in more moves than the average student.
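The two problem-level scores can be written out directly. The sketch below is illustrative only; the function names are hypothetical, and the group mean of 12 moves in the usage example is an invented value chosen to show the arithmetic, not a figure from the study.

```python
def score_1(moves_used, min_moves):
    """Moves the student used relative to the minimum required (1.0 is perfect)."""
    return moves_used / min_moves

def special_difficulty_index(group_mean_moves, min_moves):
    """Sample mean number of moves divided by the minimum number of moves."""
    return group_mean_moves / min_moves

def score_2(moves_used, min_moves, group_mean_moves):
    """Score 1 adjusted by the Special Difficulty Index."""
    return score_1(moves_used, min_moves) / special_difficulty_index(
        group_mean_moves, min_moves
    )

# The worked example from the text: 15 moves on a problem requiring 10.
print(score_1(15, 10))        # 1.5
# Hypothetical group mean of 12 moves on the same problem.
print(score_2(15, 10, 12.0))  # 1.25
```

Because the minimum number of moves cancels, Score 2 equals the student's moves divided by the group mean (15/12 = 1.25 here), which is the reduced form given above.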

To determine whether these special scores were any more
meaningful than more direct scores, such as the proportion of problems
solved, the relationships between the following four scores for the
test as a whole were examined:

1. PROPS = the proportion of problems that the student attempted
   (worked on) and solved within the maximum number of
   moves (28).

2. PROPM = the proportion of problems that the student attempted
   and solved in the minimum (optimal) number of moves.

3. Total 1 = the average Score 1 obtained on the problems the
   student attempted.

4. Total 2 = the average Score 2 obtained on the problems the
   student attempted.

It was hypothesized that the Total 2 score would prove to be the most
meaningful score, since it took into account both the solution path
length and the difficulty of the problems the student attempted, and did
not depend on the number of problems attempted. By adjusting for prob-
lem difficulty, a student was penalized more by Total 2 for less than
optimal solutions on easier problems than on more difficult problems.
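A short sketch shows how the four total scores could be computed for one student from that student's attempted problems. The record layout (`Attempt`), the field names, and the sample values are hypothetical; the sketch also assumes that an unsolved problem still contributes its recorded number of moves to Total 1 and Total 2, which the text does not state explicitly.

```python
from dataclasses import dataclass
from statistics import mean

MAX_MOVES = 28  # maximum number of moves allowed, as stated for PROPS

@dataclass
class Attempt:
    moves_used: int          # moves the student made on the problem
    min_moves: int           # optimal (minimum) number of moves
    group_mean_moves: float  # mean number of moves in the total group
    solved: bool

def total_scores(attempts):
    """PROPS, PROPM, Total 1, and Total 2 over the problems attempted."""
    n = len(attempts)
    return {
        "PROPS": sum(a.solved and a.moves_used <= MAX_MOVES for a in attempts) / n,
        "PROPM": sum(a.solved and a.moves_used == a.min_moves for a in attempts) / n,
        "Total 1": mean(a.moves_used / a.min_moves for a in attempts),
        "Total 2": mean(a.moves_used / a.group_mean_moves for a in attempts),
    }

# Hypothetical student who attempted three problems.
student = [
    Attempt(moves_used=8, min_moves=8, group_mean_moves=10.0, solved=True),
    Attempt(moves_used=15, min_moves=10, group_mean_moves=12.0, solved=True),
    Attempt(moves_used=30, min_moves=12, group_mean_moves=20.0, solved=False),
]
print(total_scores(student))
```

All four scores are averages or proportions over the problems attempted, so none of them depends on how many problems a student reached.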

Consistency of performance across problems. An important question
for determining the usefulness of this problem type for measuring spa-
tial problem-solving ability was whether reliable individual differ-
ences on various performance criteria could be identified across prob-
lem replications of similar and varying difficulty levels. To examine
this question, the consistency of the various performance scores was
examined across all 13 problems using Pearson product-moment correla-
tions. Since both of the individual problem scores (Score 1, Score 2)
were linear transformations of the number of moves used on a problem,
the consistency of these scores across problems in terms of Pearson
product-moment correlations would be the same as the stability of the
number of moves used. Thus, the stability of the following performance
indices was examined:

1. The total number of legal moves used for each problem,
2. The number of illegal moves, and
3. The number of repeated moves.

The relationship between individual problem scores and total
scores on the problem set as a whole was investigated by examining the
correlations between individual problem scores (Score 1, Score 2) and
total test scores (PROPS, PROPM, Total 1, Total 2) with and without the
particular problem being excluded from the total score. In addition,
the relationships between the total number of legal moves used (or,
equivalently, Score 1 and Score 2), the number of illegal moves, and
the number of repeated moves were examined by computing the Pearson
product-moment correlations between pairs of these performance indices
across students for all pairings of problem replications.
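Correlating a performance index across students for every pairing of problems can be sketched as follows. The data layout (a dictionary from problem number to a dictionary of student scores) and the rule of using only students with data on both problems are assumptions made for illustration; the report does not describe how missing problems were handled.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation; None if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def problem_pair_correlations(index_by_problem):
    """Correlate one index (e.g., legal moves) for all pairings of problems."""
    result = {}
    for p, q in combinations(sorted(index_by_problem), 2):
        # Assumption: use only students who have data on both problems.
        shared = sorted(set(index_by_problem[p]) & set(index_by_problem[q]))
        if len(shared) < 3:
            result[(p, q)] = None
            continue
        x = [index_by_problem[p][s] for s in shared]
        y = [index_by_problem[q][s] for s in shared]
        result[(p, q)] = pearson(x, y)
    return result

# Hypothetical numbers of legal moves for four students on two problems.
moves = {
    4: {"s1": 8, "s2": 10, "s3": 12},
    5: {"s1": 8, "s2": 12, "s3": 16, "s4": 9},
}
print(problem_pair_correlations(moves))
```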

Response latencies. During testing the time in seconds taken by a
student for every move was recorded. This allowed latency trends
across moves to be plotted and studied for each problem. Three indices
were used to quantitatively characterize a student's response
latencies for a problem:

1. Initial move latency, i.e., how long the student studied the
   initial problem configuration before making the first move;

2. The average move latency, i.e., the average time taken for a
   move across the particular problem; and

3. Total problem latency, i.e., the total time in seconds taken
   by the student on a particular problem.

In order to compare these latencies with the problem difficulty as
indexed by various performance measures, the mean of the above three
latency measures was computed across all subjects for each problem.
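The three latency indices follow directly from the per-move times recorded for one student on one problem. In the sketch below the function name and the sample latencies are hypothetical, and the average is assumed to include the first move; the report does not say whether the initial study time was excluded from the average.

```python
from statistics import mean

def latency_indices(move_latencies):
    """Initial, average, and total latency from per-move times in seconds."""
    return {
        "initial": move_latencies[0],       # time before the first move
        "average": mean(move_latencies),    # assumption: first move included
        "total": sum(move_latencies),
    }

# Hypothetical latencies (seconds) for a four-move solution.
print(latency_indices([30, 5, 10, 15]))
```

Averaging each index over all students who worked on a problem then gives the problem-level means described above.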

Although the tendency for various performance measures (e.g., the
number of moves needed) to correlate across problems indicates the
reliability of problem-solving performance, the tendency for a student's
response latencies to show consistency across problems may indicate a
cognitive style, e.g., reflectivity versus impulsiveness (Kagan, 1965;
Kagan et al., 1964) or a strategy of planning ahead versus trial and
error or impulsive responding. To study this possibility, the consis-
tency of the initial, average, and total response latency measures
across problems was examined using Pearson product-moment correlations.
For example, by correlating the initial move latency across students
for each pair of problems, it could be determined whether some students
consistently studied each problem for longer or shorter times than
other students. Similarly, by correlating the total problem latency
over students for each possible problem pair, it could be determined
whether the same students who took longer or shorter times to solve one
problem also did so on the other problems.

It was also of interest to examine the response latency trends as
the student progressed throughout each problem. Such trends may indi-
cate the degree of initial planning, the number of moves a student made
between study points, and the point at which the sequence of moves to
solution had been detected. For this purpose latency graphs for indi-
vidual students showing the response latency for each move from start
to solution were plotted and inspected visually. Latency plots were
examined for students who had performed well on the test and those who
had performed poorly and for problems solved and problems unsolved, in
order to detect any systematic differences in latency trends.

Relationship between performance and response latencies. In order
to determine if any relationship existed between students' performance
on the problem and the way they allocated their time on each problem,
Pearson product-moment correlations were computed for each problem be-
tween the initial, average, and total move latencies and the number of
moves each student used. In addition, correlations were also computed
between total test score, which better indexed the student's perfor-
mance on the test as a whole, and the initial, average, and total la-
tencies for each problem. For these correlations the total test scores
used were Total 2 and a mean judges' performance rating, described be-
low.

Judges' ratings of performance. Because reliable external crite-
ria against which the student performance scores could be validated
were not available, each student's performance on each problem was
studied independently by three judges and each student's overall test
performance was rated on a 10-point scale, with 5 being anchored to
average or mean performance, considering the sample as a whole. The
mean of the ratings of the three judges (MRATE) was used as another
index of student total test performance.

Since the judges were familiar with the difficulty of each problem
and could carefully examine the student's performance on each problem,
it was felt that these ratings would provide a more complete assessment
and rank ordering of student performance. Although less subjective,
the performance scoring methods described above were not equally able
to take into account all the information that the judges could in their
ratings. Thus, one way to compare the adequacy or refinement of the
various scoring methods was to compare the rank ordering of students by
each method with the rank ordering assigned by the judges' ratings.
This was done using Spearman rank-order correlation coefficients.
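The rank-order comparison can be illustrated with a small Spearman computation, here written as the Pearson correlation of average ranks so that tied values are handled. The function names and the sample scores are hypothetical and are not taken from the study's data.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank-order correlation (Pearson correlation of the ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five students under two scoring methods.
method_a = [1, 2, 3, 4, 5]
method_b = [2, 1, 4, 3, 5]
print(round(spearman(method_a, method_b), 3))  # 0.8
```

When the two measures run in opposite directions (for a ratio score such as Total 2, lower values mean fewer moves), agreement in rank order would appear as a negative coefficient.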

To determine how well independent judges could agree on the ra-
tings of student performance, interrater reliability as estimated by
the following form of the intraclass correlation was used:

\[
R = \frac{MS_{\text{students}} - MS_{\text{error}}}
         {MS_{\text{students}} + (K - 1)\, MS_{\text{error}}}
\]

where K is the number of judges, the various mean squares (MS) were
derived from a standard two-way analysis of variance, and the mean
square for error term represented variation due to the interaction of
students and judges. Note that since only the reliability of the rank
ordering of students, and not mean level of differences of judges'
ratings, was of interest (i.e., interrater reliability versus
interrater agreement), the error term did not include variation due to
judges (Tinsley & Weiss, 1975).
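A minimal sketch of this reliability estimate is given below. It assumes the Tinsley and Weiss (1975) form R = (MS_students - MS_error) / (MS_students + (K - 1) MS_error), with the error mean square taken from the students-by-judges residual of a two-way layout with one rating per cell. The function name and the ratings are hypothetical.

```python
def interrater_reliability(ratings):
    """Intraclass correlation from an n-students x K-judges table of ratings."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    student_means = [sum(row) / k for row in ratings]
    judge_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Between-students mean square.
    ms_students = k * sum((m - grand) ** 2 for m in student_means) / (n - 1)

    # Residual (students x judges interaction) mean square; differences in
    # judges' mean levels are removed and so do not enter the error term.
    ss_error = sum(
        (ratings[i][j] - student_means[i] - judge_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_students - ms_error) / (ms_students + (k - 1) * ms_error)

# Hypothetical ratings: three judges who differ only in overall leniency.
ratings = [[4, 5, 6], [6, 7, 8], [2, 3, 4], [8, 9, 10]]
print(interrater_reliability(ratings))  # 1.0
```

The example shows why this form measures reliability rather than agreement: the judges never give the same rating, yet they order the students identically, so the coefficient is 1.0.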

Psychological and biographical data. The frequency and percentage
of students endorsing various response alternatives to questions in the
Puzzle Reaction Questionnaire, completed at the end of testing, were
tabulated in order to determine students' prior experience with this
problem type, the perceived difficulty of the instructions of the test,
and the motivation and anxiety level of the students during the test.
Completed posttest questionnaires were obtained from 50 students. Al-
though the responses to the Puzzle Reaction Questionnaire were analyzed
and provided useful information on the motivational characteristics of
the total group, the small number of students distributed
over various response categories made group performance comparisons
between students in different response categories inappropriate for
many of the questions.

One exception was Question 2, which was especially important be-
cause it involved whether previous practice with problems of this type
would affect test performance. The relationship between a student's
prior experience with this problem type and his/her test performance
was determined by performing t tests on the differences in mean total
score (Total 2, MRATE) for those students reporting little or no prior
experience with this problem type versus students reporting much exper-
ience.

Since problems of the type used here may require higher
levels of motivation than more traditional psychometric measures, it
was also important to investigate the effect of motivation level on
performance with the limited data available. For this purpose t tests
were performed on the performance means of students reporting different
levels of motivation in Question 4.

In addition, since males as a group have generally been found to
score higher than females as a group on tests of spatial abilities
(Garai & Scheinfeld, 1968; Maccoby & Jacklin, 1974), it was of interest
to determine whether sex differences existed for this test. Thus, a t
test was used to compare the male and female group mean total scores.

Perceived Difficulty Ratings

Subjective ratings of the perceived difficulty of replications of
the 15-puzzle were obtained from a separate sample of students in order
to investigate the following questions:

1. What subjective dimensions do students use in evaluating the
   difficulty of this problem type?

2. How accurately can students evaluate the actual difficulty of
   these problems? That is, do difficulty ratings agree with
   actual performance data? How finely can discriminations be
   made between problems of similar difficulty levels?

3. Are there reliable individual differences in the perceived
   difficulty of these problems and in the ability to make finer
   discriminations?

The latter two questions, in particular, address indirectly the ques-
tion of whether students' perceptions of problem difficulty might be
related to their performance. For example, to the extent that reliable
individual differences in the ability to visualize a sequence of moves
in short-term memory exist, this might be expected to result in reliable
differences in both perceived task difficulty and in actual task per-
formance.

To maximally associate perceived difficulty with actual perfor-
mance, the same students would ideally make the ratings and solve the
problems. Due to limitations in student time, this was not possible in
the present study; instead, a second sample from the same population
was utilized. Using separate samples for the two tasks has the advan-
tage that a student's rating of problem difficulty would not influence
or be influenced by actual performance on the problem.

Procedure

Subjects. A total of 47 students from an introductory level psy-
chology course rated the difficulty of 67 stimuli. Each stimulus con-
sisted of a typed start-and-goal configuration for a 15-puzzle on an
index card. To shorten the length of the rating task for each student,
the 67 puzzles were divided into 4 sets of 16 or 17 puzzles each and
the 47 subjects were randomly assigned to one of the 4 puzzle sets.
Since the students were divided into groups merely to shorten the task,
analyses were generally carried out for the sample as a whole; thus,
the results will not be discussed separately for each group.

Data for three students were not included in the analysis because
they failed either to perform or to record their ratings in accordance
with instructions. Students took an average of about 40 to 45 minutes
to complete the rating task.

Puzzle stimuli. Selection of the 67 puzzles used in this study
was done with care because they were to be used in several ways. For
example, in order to be able to trace the perceived difficulty trend
within a single puzzle (which might require a number of moves from
start to goal), ratings were obtained for several puzzles with the same
goal configuration but with start configurations that converged on the
goal. As a result, it was possible to detect how many moves from the
goal a student would have to be before the problem would begin to look
somewhat easy, then easy, and so on.

Since one hypothesized difficulty dimension was that of path
length (or number of moves required), puzzles utilized a relatively
uniform continuum of path lengths from 1 to 26. Of the 12 problems
used in the problem-solving performance portion of the study, 9 were
included among the stimuli rated in the rating task. Of these 9, 4
were divided into subpuzzles of varying lengths, as described above, in
order to examine the perceived difficulty trend within the individual
problems.

Rating procedure. Appendix D contains a copy of the self-adminis-
tered instruction and recording booklet that each student received.
Students were told how this type of problem was solved so that they
could rate how difficult they thought it would be if they had to solve
it. Students first sorted the puzzles into six categories labeled Very
Difficult, Difficult, Somewhat Difficult, Somewhat Easy, Easy, and Very
Easy. It was made clear to the students that there was no required
number of puzzles to be sorted into any of the piles but that they
should put each puzzle into the category that had a label best describ-
ing how difficult they thought the puzzle would be to solve. In each
puzzle set four of the puzzles were specially selected ahead of time to
range from Very Easy to Very Difficult, in terms of path length. These
four puzzles had a special message on the index card instructing the
student to provide reasons, or a basis, for sorting the stimuli into a
specific category. These reasons, along with the posttask questions
(see Appendix D) regarding what rules or criteria they used for sorting
into each of the six categories, constituted the protocols that were
later analyzed to determine the dimensions on which the students
thought they were sorting.

After recording the puzzles that were sorted into the original six
categories, students were asked to attempt to break down each category
into subcategories based on finer difficulty discriminations. The stu-
dents were encouraged to subdivide into as many subcategories as they
could, but only to do so if they felt they could differentiate the dif-
ficulty of the puzzles in the same category. No re-sorting across the
original six categories was allowed. After recording the stimuli in
each of the final subdivided categories, students responded to a ques-
tionnaire that gathered information about their prior experience with
this kind of puzzle, whether they had difficulty understanding the
task, and their motivation level during the study. More importantly,
students provided their own rules or criteria for sorting into each of
the categories, for example, how they distinguished a Very Easy from a
Somewhat Easy puzzle.

On the last page of the booklet, and after the students had al-
ready volunteered their own rating dimensions or rules, a list of nine
dimensions was provided, which were hypothesized to be related to stu-
dents' ratings. Students were asked to indicate for each of the nine
dimensions whether they considered it in all, most, some, or none of
the puzzle ratings. These nine dimensions also included two dimensions
that were supposed to serve as validity dimensions (see Questions 8c
and 8f in Appendix D). It was felt that these dimensions (particularly
8f) would be irrelevant to perceived difficulty and would therefore
serve to detect students who were randomly responding or feeling that
they should have used every dimension suggested by the experimenter.

Analysis

Reported dimensions of difficulty. Self-reported dimensions of
perceived difficulty were thus of two types in this study. First, stu-
dents voluntarily provided the basis for their difficulty judgments
during the sorting task. During this portion of the task, students
were provided no information as to the dimensions to be used in making
their judgments. After sorting the puzzles into piles representing
different perceived difficulty levels, an experimenter-provided list of
possible rating dimensions was provided and students indicated whether
they used each dimension on all, most, some, or none of the problems.

For each type of self-report (the voluntary protocols and the ex-
perimenter-provided dimensions), the proportion of students reporting
use of each dimension was calculated and a determination was made of
the most frequently used or important rating dimensions. Judgments of
which dimensions were being reported during the sorting task were made
by one graduate and one undergraduate research assistant and involved
studying the students' written responses to the "Provide your reasons"
section of the rating booklet (see Appendix D, Step 1) and Questions 5,
6, and 7 in the postrating task questionnaire (see Step 4 in Appendix
D). Representative protocols provided by the students to indicate use
of each reported rating dimension are contained in Appendix E.

Perceived difficulty scale values. Scale values representing mean
perceived difficulty were obtained from the final subdivided category
sorting of the puzzles. The center point of each of the original six
categories was assigned the number 5, 15, 25, 35, 45, or 55 for the
respective categories Very Easy, Easy, Somewhat Easy, Somewhat Diffi-
cult, Difficult, and Very Difficult. When puzzles within one of these
six categories were subdivided into subcategories, the five integer
intervals on each side of the center point were prorated or divided to
assign differential rating values to each puzzle. The mean rating
across all students was then computed to obtain the subdivided scale
values. These subdivided scale values were then divided by 10 to scale
them from 1 to 6, thus making them comparable to the original category
labels. Thus, a puzzle felt to be Very Easy by the average student
would have a scale value in the range of about .5 to 1.5, an Easy puz-
zle's scale value would range from 1.5 to 2.5, and so on.

These scale values were then used to determine the range of prob-
lems (e.g., problems requiring three to six moves) perceived to be in
each of the categories (e.g., Very Easy, Easy) by plotting the scale
values versus the solution path lengths of the puzzles. Finally, the
relationship between perceived difficulty and actual performance on the
set of puzzles administered to the first sample of students was investi-
gated by correlating mean difficulty ratings with the performance and
response latency measures obtained for the nine puzzles that were in-
cluded in both the performance and difficulty rating portions of this
study.

Relationships Between Objective and Subjective Difficulty Indices

Each of the performance and latency measures, the phys-
ical problem characteristics, and the perceived difficulty mean ratings
can be considered potential problem difficulty indices. For example,
the difficulty of a problem could be indexed in several ways: (1) by
the proportion of persons solving it, (2) by the average response la-
tency used in working on the problem, or (3) by the number of squares
needing to be moved large distances in the pattern. The similarity of
the rank orders of various objective indices will likely vary.

In addition, the rank orderings of the problem difficulties by
performance or physical indices obtained in the first part of the study
can be compared with the rank ordering of subjective (perceived) diffi-
culty obtained in the second part of the study. For this purpose, the
Spearman rank-order correlation coefficient was computed between the
rank orders of problem difficulty provided by all performance, latency,
physical, and perceived difficulty indices. Some of the questions ad-
dressed through examination of these correlations were as follows:

1. Do the performance criteria used in this study (mean number of
   moves, proportion solving the problem, proportion solving it
   in the minimum number of moves) similarly index problem diffi-
   culty?

2. Do problems that take the most total time to solve or that
   require longer average move latencies also involve longer ini-
   tial study times or latencies?

3. Is there a relationship between the difficulty of a problem as
   indexed by performance criteria and the initial move latency,
   average move latency, or total time taken in solving the prob-
   lem?

4. How well does the perceived difficulty of the problems agree
   with the actual difficulty as indexed by performance and la-
   tency data and various physical attributes of the problem?

5. Which physical characteristics of the problem (e.g., path
   length, number of squares out of order) are most predictive of
   various performance and latency measures?

Computer-Administered Problems

Problem Characteristics

Indices of problem difficulty. Table 1 shows the number of stu-
dents who attempted each problem (including the practice problem, Prob-
lem 1), the optimal or minimal number of moves required to solve each
problem (path length), and the frequency and percentage of students who
used various numbers of moves before solving or giving up working on
the problem. These data suggest that most of the problems were too
easy, with from 70.4% to 98.2% of the students solving 9 of the 13
problems in the optimal number of moves. Problems 10, 12, 13, and, to
a lesser extent, Problem 9 were more challenging, with from 14.6% to
45.7% of the students solving the problems in the optimal number of
moves. The data in Table 1 also show that the optimal number of moves
was not a perfect indicator of difficulty as indexed by student perfor-
mance. Problems 4 and 5, which could be solved in 8 moves,
were solved in the optimal number of moves less frequently (75.9% and
77.8%) than Problem 6 (87.0%), for which the optimal number of moves
was 10. Similarly, Problems 10 and 11 could both be solved optimally
in 14 moves, but only 29.5% of the students solved Problem 10 in that
number of moves, whereas 79.6% of the students solved Problem 11 in the
optimal number of moves.

                               Table 1
            Optimum Number of Moves and Distributions of
             Observed Number of Moves for Each Problem

Problem   Optimum No.   Students     Observed No.      Frequency
Number    of Moves      Attempting   of Moves          N       %
------------------------------------------------------------------
   1           3            55            3           54    98.2
                                          5            1     1.8

   2           4            33            4           31    93.9
                                         12            1     3.0
                                         13            1     3.0

   3           6            33            6           32    97.0
                                          8            1     3.0

   4           8            54            8           41    75.9
                                         10            8    14.8
                                         18            1     1.9
                                         21            1     1.9
                                         25            2     3.7
                                         26            1     1.9

   5           8            54            8           42    77.8
                                         12            4     7.4
                                         13            1     1.9
                                         14            3     5.6
                                         18            1     1.9
                                         24            1     1.9
                                         26            2     3.7

   6          10            54           10           47    87.0
                                         12            3     5.6
                                         16            1     1.9
                                         27            1     1.9
                                         32            1     1.9
                                         34            1     1.9

   7          10            54           10           38    70.4
                                         12            2     3.7
                                         16            2     3.7
                                         18            1     1.9
                                         20            8    14.8
                                         22            1     1.9
                                         26            1     1.9
                                         30            1     1.9

   8          12            50           12           39    78.0
                                         14            1     2.0
                                         20            1     2.0
                                         24            1     2.0
                                         25            1     2.0
                                         27            1     2.0
                                         28            3     6.0
                                         32            1     2.0
                                         33            1     2.0
                                         35            1     2.0

   9          12            46           12           21    45.7
                                         14            5    10.9
                                         16            3     6.5
                                         18            1     2.2
                                         20            1     2.2
                                         22            2     4.3
                                         24            2     4.3
                                         25            1     2.2
                                         26            2     4.3
                                         27            1     2.2
                                         28            2     4.3
                                         32            2     4.3
                                         33            1     2.2
                                         35            1     2.2
                                         36            1     2.2

  10          14            44           14           13    29.5
                                         16            3     6.8
                                         18            7    15.9
                                         20            2     4.5
                                         21            1     2.3
                                         22            3     6.8
                                         24            1     2.3
                                         26            3     6.8
                                         27            4     9.1
                                         28            1     2.3
                                         32            1     2.3
                                         34            2     4.5
                                         35            3     6.8

  11          14            49           14           39    79.6
                                         16            4     8.2
                                         20            1     2.0
                                         22            1     2.0
                                         26            1     2.0
                                         27            1     2.0
                                         28            1     2.0
                                         32            1     2.0

  12          16            48           16            7    14.6
                                         18            3     6.3
                                         20            5    10.4
                                         23            1     2.1
                                         24            3     6.3
                                         25            3     6.3
                                         26            4     8.3
                                         27            3     6.3
                                         28            3     6.3
                                         30            5    10.4
                                         32            4     8.3
                                         33            2     4.2
                                         34            2     4.2
                                         35            2     4.2
                                         39            1     2.1

  13          16            49           16           10    20.4
                                         18            1     2.0
                                         19            1     2.0
                                         20            2     4.1
                                         22            2     4.1
                                         25            4     8.2
                                         26            2     4.1
                                         27            3     6.1
                                         28            8    16.3
                                         30            4     8.2
                                         32            1     2.0
                                         33            1     2.0
                                         34            7    14.3
                                         35            3     6.1

Additional data on student performance characteristics for the
problems are shown in Table 2. With the exception of Problems 9, 10,
12, and 13, the mean number of moves used on each problem (row 1 of
Table 2) was quite close to the minimum number of moves required for
its solution (row 9). Row 2 of Table 2 shows that all students solved
the first five problems in the allowed maximum number of moves (for the
longer problems the maximum number of moves allowed was 28), and only
for Problems 12 (66.6% solving) and 13 (66.4% solving) were there sub-
stantial numbers of students unable to solve the problems. Row 3 re-
ports the proportion of students solving each problem in the optimal
number of moves. With the exception of Problems 9, 10, 12, and 13, 70%
or more of the students were able to solve the rest of the problems in
the optimal number of moves.

Row 4 of Table 2 contains the Special Difficulty Index, which ad-
justs for the differing path lengths (minimum number of moves required)
of the problems. For example, for Problem 4 this index equaled 1.21,
indicating that the average student required 21% more than the minimum
number of moves to solve the problem. The difficulty of each problem
as indicated by this index agreed quite well with the performance data
in rows 1 through 3. Again, only Problems 9, 10, 12, and 13, with spe-
cial indexes of 1.45, 1.44, 1.50, and 1.51, required substantial num-
bers of moves over the minimum number required for solution.
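The Problem 4 figure can be checked against the move distribution reported for that problem in Table 1. The sketch below is illustrative: the function name is hypothetical, and it assumes that the mean is taken over every student who attempted the problem, whether or not the problem was solved. Under that assumption the Problem 4 value of 1.21 is reproduced; the report may have treated unsolved attempts differently for other problems.

```python
def special_difficulty_index(move_counts, min_moves):
    """Mean number of moves in the sample divided by the minimum required.

    move_counts maps an observed number of moves to the number of students
    who used that many moves.
    """
    n_students = sum(move_counts.values())
    mean_moves = sum(m * f for m, f in move_counts.items()) / n_students
    return mean_moves / min_moves

# Problem 4 distribution from Table 1 (54 students, optimum of 8 moves).
problem_4 = {8: 41, 10: 8, 18: 1, 21: 1, 25: 2, 26: 1}
sdi = special_difficulty_index(problem_4, min_moves=8)
print(round(sdi, 2))  # 1.21
```

The mean here is 523/54, or about 9.69 moves, which is 21% above the 8-move minimum.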

A comparison of the performance indices and the Special Difficulty
Index with the minimum number of moves required (row 9) indicates that
although the difficulty of the problems tended to increase with solu-
tion path length (minimum number of moves required), the relationship
was not strictly monotonic. For example, although Problem 11 required
at least 14 moves for solution, this problem was much easier for stu-
dents than some of the problems requiring fewer moves. Thus, minimal
solution path was not the sole determinant of a problem's difficulty.

Response latencies. Rows 5 through 7 of Table 2 show the mean
initial, average, and total latencies of students for each of the 13
problems. The data on average amount of time spent by students prior
to their first move (mean initial latency) indicate a strong, though
not perfect, relationship with the difficulty of the problem as indexed
by the other performance criteria. This relationship appears even
stronger for the total time in seconds used by the average student (row
7) to solve problems of varying difficulty. For example, the mean
initial and total move latencies were smallest for two of the problems
with the shortest path lengths (Problems 2 and 3) and were longest for
the four problems with the longest path lengths (Problems 9 through
13). The trend for the remaining se ...
except that students seemed, not surprisingly, to use more time to
study and to complete the practice problem than would be predicted on
the basis of its short path length. Students usually took about 20 to
60 seconds to make their first move, whereas total time working on a
single problem ranged from about 67 to 361 seconds. Most problems were
solved in about 2.5 minutes (150 seconds) or less.

There appeared to be no consistent relationship between path
lengths of the problems and the average latency for the moves within a
single problem (row 6 in Table 2). Students generally took from 9 to
15 seconds to make a single move, although again more time was taken on
the practice problem (Problem 1).

Perceived scale values. Row 8 of Table 2 shows the mean perceived
difficulty scale values for the nine test problems that were included
in the perceived difficulty rating portion of the study. Given the
assignment of the numbers 1, 2, 3, 4, 5, and 6, respectively, to the
categories Very Easy, Easy, Somewhat Easy, Somewhat Difficult,
Difficult, and Very Difficult, row 8 shows that none of the problems was



                                                         Table 2

                         Performance Data and Perceived Scale Values for Each Problem

                                                                 Problem
                                      1      2      3      4      5      6      7      8      9     10     11     12     13

 1. Mean No. of Moves              3.04   4.52   6.06   9.68   9.10  11.39  12.81  15.24  18.00  21.04  15.61  25.54  25.84
 2. Proportion Solving           100.00 100.00 100.00 100.00 100.00  94.50  98.20  88.00  84.80  86.40  98.00  66.60  66.40
 3. Proportion Solving Optimally  98.20  93.90      ?  75.90  77.80  87.00  70.40  78.00  45.70  29.50   79.?  14.60  20.40
 4. Special Difficulty Index       1.01   1.13   1.01   1.21   1.23   1.12   1.28   1.24   1.45   1.44   1.11   1.50   1.51
 5. Mean Initial Latency          53.50  27.40  21.80  43.90  45.60  39.90  33.10  43.10  61.70  62.20  45.10  63.60  59.50
 6. Mean Average Latency          25.80  15.30  10.?0  14.00  15.10  12.80   8.90   9.90  14.00  13.10   9.30  11.60  14.50
 7. Mean Total Latency            94.30  82.30  67.40 151.50 160.90 150.20 117.90 151.30 244.90 278.30 150.60 290.10 361.60
 8. Perceived Difficulty                                2.20          3.00   2.90   3.60   3.91   4.?5   3.08   4.31   3.00
 9. Solution Path Length              3      4      6      8      8     10     10     12     12     14     14     16     16
10. Mean Illegal Moves              .49    .82    .21    .87    .54    .50    .63      ?    .74   1.05    .61   1.56    .92
11. Mean Repeated Moves             .04    .15    .03    .37    .17    .32    .04    .18    .56   1.09    .22   1.02   1.40
considered Difficult or Very Difficult by the average student. Four
problems (9, 10, 12, and, to a lesser extent, 8) were considered
Somewhat Difficult, and the remaining problems (4, 6, 7, 11, and 13)
were perceived as Easy or Somewhat Easy. The difficulty perceptions
generally indicated agreement with actual performance indices of
difficulty, but there were some marked exceptions. In particular,
Problems 8 and 11 were perceived as being more difficult than was
indicated by the performance data, whereas the most difficult problem,
Problem 13 with Special Difficulty Index of 1.51 (row 4), was perceived
as being Somewhat Easy by the average student. These data indicate that
students' initial difficulty perceptions of these problems are
fallible, particularly for problems with longer solution paths.

Illegal and repeated moves. Rows 10 and 11 of Table 2 contain the
mean number of illegal and repeated moves made on each problem. These
data indicate that students made few illegal or repeated moves (means
less than 1.0) on most of the problems and that, with the exception of
Problems 10, 12, and 13, there seemed to be little if any relationship
between the difficulty or the minimum number of moves required and the
number of illegal or repeated moves. For Problems 10, 12, and 13,
however, the average student made approximately one or more illegal and
one or more repeated moves. This is to be expected for the more
difficult problems, since the students worked longer on them and thus
had a greater chance of making typing errors and other illegal moves.
It would be difficult to unconfound this with any tendency to be
more careless on the more difficult problems. The slightly increased
number of repeated move configurations for Problems 10, 12, and 13 may
be more meaningful, indicating a greater likelihood of students needing
to back up in their solutions to the more difficult problems. Because
of the small number of illegal and repeated moves made by the average
student on these problems, these measures were not considered further
as potential indices of problem difficulty (e.g., they do not appear in
Table 3).

Relationships among indices of difficulty. Table 3 shows
rank-order correlations among the potential indices of problem
difficulty: performance indices, latency measures, perceived
difficulty, and various physical problem characteristics. Data for
Variables 1 through 9 are in Table 2; data for Variables 10 through 14
are in Appendix C for each of the problems.
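
As an illustration of how a rank-order coefficient of this kind can be obtained, the following sketch computes Spearman's rho for two rows of Table 2, solution path length (row 9) and mean total latency (row 7). The use of average ranks for ties followed by a product-moment correlation on the ranks is an assumption; the report does not state how ties were handled, and the function names are illustrative.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank-order correlation with average ranks for ties."""
    return pearson(average_ranks(x), average_ranks(y))


# Rows 9 and 7 of Table 2 (Problems 1 through 13).
path_length = [3, 4, 6, 8, 8, 10, 10, 12, 12, 14, 14, 16, 16]
total_latency = [94.3, 82.3, 67.4, 151.5, 160.9, 150.2, 117.9,
                 151.3, 244.9, 278.3, 150.6, 290.1, 361.6]

rho = spearman_rho(path_length, total_latency)
print(f"rho = {rho:.2f}")
```

Under these assumptions the result comes out close to the value reported for this pair of variables in Table 3.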

The correlations in rows 2 through 4 of Table 3 show that the
difficulty indices based on group performance data rank ordered the
difficulty of the problems quite similarly, with the strongest
agreement between the Special Difficulty Index and the proportion of
students solving the problem in the optimal number of moves (ρ = -.95)
and between the mean number of moves used and the proportion of
students solving the problem in the maximum allowed moves (ρ = -.86).
The utility of the Special Difficulty Index over the other performance
indices of difficulty is suggested by its lower correlation with
solution path length (ρ = .77). For example, using the mean number of
moves required by the sample to solve different problems is less
adequate as an indicator of problem difficulty because it labeled all
puzzles with long solution paths as difficult (ρ = .98) when, in fact,
not all long puzzles were difficult (e.g., Problem 11).
                                                   Table 3

                                                                        Perceived
                                    Performance Indices    Latencies    Difficulty    Problem Characteristics
Difficulty Index                      1     2     3     4     5     6     7     8     9    10    11    12    13

 1. Mean Moves
 2. Proportion Solving             -.86
 3. Proportion Solving Optimally      ?   .75
 4. Special Difficulty              .85  -.81  -.95
 5. Mean Initial Latency            .65  -.63  -.67   .63
 6. Mean Average Latency           -.34   .24   .15  -.07   .25
 7. Mean Total Latency              .84  -.76  -.89   .86   .84   .08
 8. Perceived Difficulty            .64  -.63  -.43     ?   .75   .00   .48
 9. Solution Path Length            .98  -.91  -.80   .77   .61  -.42   .79   .63
10. No. of Squares Not Matching     .42  -.37  -.11   .04   .04  -.67   .19   .32   .51
11. No. of Rows Not Matching        .25  -.26  -.04   .00   .46  -.21   .17   .70   .34   .37
12. No. of Columns Not Matching     .23  -.20   .10  -.08  -.10  -.35   .13  -.03   .30   .81   .06
13. Euclidean Distance              .62  -.60  -.36   .31   .22  -.68   .35   .38   .70   .85   .34   .60
14. City-Block Distance             .67  -.62  -.45   .35   .27  -.71   .37   .37   .74   .79   .35   .48   .98

Note. For Row 8 only, correlations greater than .64 are statistically significant; for all other
rows, correlations greater than .51 are statistically significant.



The intercorrelations of the latency variables in rows (and
columns) 5 through 7 of Table 3 indicate that only the mean initial and
total latency measures rank ordered the problem difficulties similarly.
That is, problems which took longer times to solve were also studied
longer initially (ρ = .84), but the average time for moves within a
problem was not significantly related to either the initial move
(ρ = .25) or the total problem latency (ρ = .08).

The correlations between the latency variables (rows 5 through 7)
and the performance variables (columns 1 to 4) show that the total time
spent on a problem (row 7) was highly predictive
(ρ's = .84, -.76, -.89, .86) of difficulty as indexed by performance
indices, and the amount of initial study time spent by the average
student (row 5) was also strongly related to the four performance
indices (ρ's = .65, -.63, -.67, .63). That is, not surprisingly, more
difficult problems were studied longer initially and took longer to
solve. The correlations in columns 5 to 7 of row 9 also show a strong
relationship between mean initial and total latency and solution path
length (ρ's = .61 and .79, respectively), indicating that the problems
with longer solution paths were studied longer initially and worked on
longer.

The correlations in columns 1 to 4 of row 8 show that students'
perceptions of problem difficulties agreed somewhat, but not as much as
might be expected, with the actual performance measures (ρ's = .64,
-.63, -.43, ...). Although all these correlations were in the
appropriate direction, only the first two approached statistical
significance, due to the small number of problems (nine) for which both
performance and perceived difficulty indices were available. The
perceived difficulty scale values in row 8 of Table 2 suggest that this
lower-than-expected relationship was due to the students' inability to
differentiate the relative difficulties of problems with longer
solution paths (such as those used in Problems 9 through 13). The
correlation between perceived difficulty and solution path length
(ρ = .63) was not as high for the problems solved on the computer as
for the larger stimulus set used in the rating study (r = .88),
probably because the range of path lengths used in the computer test
was more restricted.

The only significant correlation between perceived difficulty and
latencies (columns 5 to 7) was with the mean initial latency measure
(ρ = .75). In fact, this represented the highest correlation in the
matrix for both variables. This relationship suggests that the
problems that were studied longest before a move was made were the ones
perceived as being most difficult (even more than whether or not these
problems actually were the most difficult).

Examination of the correlations in column 8 shows that perceived
difficulty of the problems in the test was significantly related to
only two physical problem characteristics: solution path length
(ρ = .63) and number of rows not matching in the two patterns
(ρ = .70). Correlations with some of the other physical problem
characteristics (e.g., the number of squares not matching and the
Euclidean and City-Block distance functions) were probably restricted
by the reduced range of values in the computerized test as opposed to
the rating study (see section below on dimensions of perceived
difficulty).

Examination of rows 9 to 14, columns 1 to 4, shows that only the
solution path length (row 9), and to a lesser extent the Euclidean and
City-Block distances (rows 13 and 14), were useful in predicting
difficulty as indexed by the four performance measures of difficulty.
Solution path length rank ordered problem difficulty quite similarly to
the four performance measures (ρ's = .98, -.91, -.80, and .77), being
most independent of the Special Difficulty Index (ρ = .77). The two
distance functions moderately predicted solution in the optimal number
of moves (ρ's = -.36 and -.45, neither significant) and the Special
Difficulty Index (ρ's = .31 and .35, neither significant).

Solution path length (row 9) was the only physical problem
characteristic to predict mean initial (ρ = .61) or mean total
(ρ = .79) problem latency. Interestingly, while average move latency
(column 6) was not related to any of the performance criteria, it was
inversely related to three physical problem characteristics: the number
of squares not matching (ρ = -.67), the Euclidean distance function
(ρ = -.68), and the City-Block distance function (ρ = -.71). These
negative correlations suggest the possibility that students worked
faster and made moves more quickly when they could see that many
numbers would need to be moved, especially if these numbers had to be
moved long distances in the puzzle.

The intercorrelations of the physical problem characteristics in
rows and columns 9 to 14 show that the more highly related problem
characteristics were solution path length, the number of squares not
matching, and the two distance functions. For this set of problems,
the Euclidean and City-Block distances were virtually identical
(ρ = .98). Although the number of rows not matching did not relate to
other physical problem characteristics, the number of columns not
matching did correlate with the number of squares not matching
(ρ = .81) and the Euclidean distance measure (ρ = .60). Whether the
number of rows or columns not matching was more or equally related to
other physical indices, however, is strictly dependent on the
particular set of problem replications used.
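
The two distance functions are defined elsewhere in the report, so the sketch below rests on an assumption: each function is taken to be the sum, over the numbered squares, of the displacement between a square's position in the start configuration and its position in the goal configuration, measured either along rows and columns (city-block) or in a straight line (Euclidean). The grid encoding, with 0 for the blank, and all names are hypothetical.

```python
import math


def tile_positions(grid):
    """Map each tile value to its (row, column) position in the grid."""
    return {tile: (r, c)
            for r, row in enumerate(grid)
            for c, tile in enumerate(row)}


def displacement_distances(start, goal, blank=0):
    """Return (city_block, euclidean) displacement summed over numbered tiles."""
    start_pos = tile_positions(start)
    goal_pos = tile_positions(goal)
    city_block = 0
    euclidean = 0.0
    for tile, (r1, c1) in start_pos.items():
        if tile == blank:
            continue  # the empty square is not counted
        r2, c2 = goal_pos[tile]
        city_block += abs(r1 - r2) + abs(c1 - c2)
        euclidean += math.hypot(r1 - r2, c1 - c2)
    return city_block, euclidean


goal = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 0]]
# Two legal moves away from the goal: tiles 15 and 11 are each one square off.
start = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 0, 12],
         [13, 14, 11, 15]]

city, eucl = displacement_distances(start, goal)
print(city, eucl)
```

When every displaced square moves only along a single row or column, the two measures coincide, which is one way a correlation near 1.0 between them could arise for a particular set of problems.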

Assessment of Individual Student Performance

Scoring methods. For each individual problem two scores were
computed: Score 1, defined as the number of moves the student required
divided by the minimum number required, and Score 2, defined as Score 1
divided by (corrected for) the Special Difficulty Index. Four total
scores were also derived: Total 1 and Total 2 were the averages over
the problems attempted of Score 1 and Score 2, respectively, and PROPS
and PROPM were the proportion of problems attempted that were solved
within the maximum allowed moves (PROPS) and in the minimum number of
moves (PROPM). Table 4 shows the means, standard deviations, and range
of all these scores for the present sample.
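
The scoring definitions above can be expressed compactly in code. The sketch below is illustrative only: the `Attempt` record and its field names are hypothetical, the three attempts are invented, and the treatment of an unsolved problem (scored on the moves actually used) is an assumption that the text does not confirm.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Attempt:
    moves_used: int                  # moves the student made on the problem
    min_moves: int                   # length of the optimal solution path
    special_difficulty_index: float  # sample mean moves / min_moves
    solved: bool                     # solved within the maximum allowed moves


def score_1(a):
    """Moves required divided by the minimum number required."""
    return a.moves_used / a.min_moves


def score_2(a):
    """Score 1 corrected for the Special Difficulty Index of the problem."""
    return score_1(a) / a.special_difficulty_index


def total_scores(attempts):
    """Total 1, Total 2, PROPS, and PROPM over the problems attempted."""
    n = len(attempts)
    return {
        "Total 1": mean(score_1(a) for a in attempts),
        "Total 2": mean(score_2(a) for a in attempts),
        "PROPS": sum(a.solved for a in attempts) / n,
        "PROPM": sum(a.solved and a.moves_used == a.min_moves
                     for a in attempts) / n,
    }


# Hypothetical protocol for one student who attempted three problems.
attempts = [
    Attempt(moves_used=8, min_moves=8, special_difficulty_index=1.21, solved=True),
    Attempt(moves_used=15, min_moves=10, special_difficulty_index=1.12, solved=True),
    Attempt(moves_used=28, min_moves=16, special_difficulty_index=1.50, solved=False),
]
totals = total_scores(attempts)
for name, value in totals.items():
    print(f"{name}: {value:.2f}")
```

Because every total is an average or proportion over the problems attempted, a total score is defined for a student regardless of which subset of problems that student worked on.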

Note that although not all students worked on each individual
problem, thus not having a score (Score 1, Score 2) for each problem,
the four total scores were obtainable for all students (N = 55) as a
result of the way these scores were defined. PROPS and PROPM can be
considered additive scores, which essentially total the number of
problems solved or solved optimally, whereas Total 1 and especially
Total 2 take into account the pattern of scores across the problems
attempted. The latter two scores would appear to be particularly
appropriate for
                              Table 4

         Mean, Standard Deviation, and Range of Four Total
           Scores and Thirteen Individual Problem Scores

                                      Standard     Best    Poorest
Score      Problem     N     Mean     Deviation    Score    Score

PROPS
PROPM
Total 1
Total 2

Score 1       1       55     1.01       .09        1.00     1.67
              2       33     1.13       .52        1.00     3.25
              3       33     1.01       .06        1.00     1.33
              4       54     1.21       .56        1.00     3.25
              5       54     1.23       .55        1.00     3.25
              6       54     1.12       .41        1.00     2.80
              7       54     1.28       .49        1.00     2.80
              8       50     1.24       .49        1.00     2.33
              9       46     1.45       .54        1.00     2.33
             10       44     1.44       .40        1.00     2.00
             11       49     1.11       .27        1.00     2.00
             12       48     1.50       .29        1.00     1.75
             13       49     1.51       .31        1.00     1.75

Score 2       1       55     1.00       .09         .99     1.65
              2       33     1.00       .08         .89     2.88
              3       33     1.00       .06         .99     1.32
              4       54     1.00       .46         .83     2.69
              5       54     1.00       .44         .81     2.64
              6       54     1.00       .37         .89     2.50
              7       54     1.00       .38         .78     2.19
              8       50     1.00       .40         .81     1.88
              9       46     1.00       .37         .69     1.61
             10       44     1.00       .28         .69     1.39
             11       49     1.00       .25         .90     1.80
             12       48     1.00       .19         .67     1.17
             13       49     1.00       .20         .66     1.16

Note that higher numbers represent better scores for the PROPS
and PROPM scores and lower numbers reflect better scores for the
Total 1 and Total 2 scores.



adaptive testing where not all students work on the same problems.

From the mean PROPS score it can be seen that the average student
solved 63% of the problems attempted in the maximum allowable moves.
At least one student solved all the problems attempted (best score =
1.00), and the student with the poorest score (.50) solved only half of
the problems attempted. The PROPM data indicate that the average
student solved 66% of the problems attempted in the optimal number of
moves, with proportions ranging from 100% to 36% solved optimally. The
Total 1 mean score shows that the average student required 25% (mean =
1.25) more moves than optimally required to solve the average problem.
At least one student averaged 70% more moves than required (poorest
score = 1.70), and one solved all problems attempted in the minimum
number of moves (best score = 1.00).



The Total 1 score represents the proportion of moves greater than the
minimum number possible, and the Total 2 score represents the
proportion of moves greater or less than the mean number required by
the group as a whole. This is also true for the difference between the
two individual problem scores, Score 1 and Score 2. Thus, by
definition, the mean Total 2 score and mean Score 2 equal 1.00. The
best Total 2 score was .84, indicating an average problem solution of
16% fewer moves than the group norm, whereas the poorest Total 2 score
was 1.38, indicating that one student required 38% more moves on the
average than did the average student in the group.

By definition, the mean Score 1 for each problem will be equal to
the Special Difficulty Index (i.e., mean number of moves required by
the sample divided by optimal number of moves). However, the data in



                              Table 5

          Judges' Ratings and Mean Rating (MRATE) of
          Total Test Performance for Each Student

               Judge                                Judge
Student    1     2     3    MRATE    Student    1     2     3    MRATE

    1      6     6     7     6.3        30      6     4     4     4.7
    2      7     5     6     6.0        31      6     5     7     6.0
    3      6     6     7     6.3        32      3     3     3     3.0
    4      4     4     5     4.3        33      2     3     3     2.7
    5      5     5     5     5.0        34      7     8     6     7.0
    6      7     7     8     7.3        35      8     6     7     7.0
    7      6     6     8     6.7        36      5     4     7     5.3
    8      4     3     6     4.3        37      2     3     6     3.7
    9      7     5     7     6.3        38      7     6     2     5.0
   10      8     6     8     7.7        39      6     4     5     5.0
   11      4     3     6     4.3        40      5     5     5     5.0
   12      8     8     8     8.0        41      5     5     6     5.3
   13      5     5     6     5.3        42      4     5     6     5.0
   14      4     4     5     4.3        43      9     8     9     8.7
   15      2     2     3     2.0        44      4     3     2     3.0
   16      3     3     4     3.3        45      5     4     6     5.0
   17      8     8     8     8.0        46      7     7     8     7.3
   18      7     5     6     6.0        47      5     5     5     5.0
   19      6     5     5     5.3        48      5     5     6     5.3
   20      5     5     5     5.0        49      3     3     3     3.0
   21      4     5     5     4.7        50      8     8     8     8.0
   22      3     3     3     3.0        51      7     6     7     6.7
   23      8     6     7     7.0        52      2     2     1     1.7
   24      6     4     5     5.0        53      7     5     8     6.7
   25      6     5     5     5.3        54      2     3     2     2.3
   26      3     3     3     3.0        55      1     2     2     1.7
   27      8     8     7     7.7       Mean    5.3   4.8   5.4    5.2
   28      3     3     3     3.0        SD     2.0   1.7     ?    1.8
   29      6     5     5     5.3

Table 4 show differing levels of difficulty for the 13 problems as
indexed by Score 1. For example, Problem 9 (mean Score 1 = 1.45) was
more difficult for the sample than Problem 4 (mean Score 1 = 1.21),
since Problem 9 required an average of 45% more moves than the minimum
number, compared with 21% more moves than the minimum number for
Problem 4.



Score 2, like Total 2, indexes performance relative to the mean
student. As a result, the mean Score 2 across all students is 1.00 for
each problem by definition. Values of Score 2 below 1.00 indicate
fewer moves than the average student, and scores greater than 1.00
reflect more moves than the average student. Examination of the best
and poorest values of Score 2 indicates considerable variability in
student performance on the problems. The best student on Problem 13
completed the problem in two-thirds of the average number of moves
required by the average student, and the poorest student on Problem 2
required 2.88 times the average number of moves.

Judges' ratings of performance. Table 5 contains the ratings on a
10-point scale of each student's overall test performance by three
independent judges and the resulting mean rating (MRATE) used as a
criterion in this study, against which to compare the alternative
scoring methods. The mean and standard deviation of each judge's
ratings and the overall mean ratings are also shown. The means of each
column were all close to 5.0, which is appropriate, since the judges
were instructed to assign a rating of 5.0 to students with average
performance. The similar standard deviations indicate a comparable
spread of judgments by each judge. For only 6 of 55 students did any
two judges differ by more than 2 in their assigned ratings. Of these 6
students, 4 were inconsistent in that they performed either very well
on most problems and very poorly on a few (Students 8 and 11) or well
on some difficult problems but less well on easier ones (Students 37
and 53). One of the students (Student 36) did not have data for three
problems on an important part of the test, making it difficult to
evaluate that student's overall performance on the test.
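
The mean rating and the disagreement check described above amount to simple arithmetic, sketched here for a handful of students whose ratings are taken from Table 5. The function names and the rounding of MRATE to one decimal place are assumptions made for the illustration.

```python
# Ratings by Judges 1, 2, and 3 for selected students from Table 5.
ratings = {
    11: (4, 3, 6),
    12: (8, 8, 8),
    36: (5, 4, 7),
    37: (2, 3, 6),
    38: (7, 6, 2),
    53: (7, 5, 8),
    55: (1, 2, 2),
}


def mrate(judge_ratings):
    """Mean of the judges' ratings, rounded to one decimal place."""
    return round(sum(judge_ratings) / len(judge_ratings), 1)


def judges_disagree(judge_ratings, threshold=2):
    """True if any two judges differ by more than the threshold."""
    return max(judge_ratings) - min(judge_ratings) > threshold


mean_ratings = {student: mrate(r) for student, r in ratings.items()}
flagged = sorted(s for s, r in ratings.items() if judges_disagree(r))
print(mean_ratings)
print(flagged)
```

Taking the spread between the highest and lowest rating is sufficient here, because the largest difference between any two judges is always the difference between the maximum and the minimum.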

Table 6 shows the results of the interrater reliability analysis.
As Table 6 shows, most of the variance in ratings was due to individual
differences in student performance, and substantial interrater
reliability (r = .80) was obtained.
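
One common way to obtain a reliability coefficient from an analysis of variance of this kind is the intraclass correlation for a single rater. The sketch below applies that formula to the sums of squares in Table 6; the choice of formula is an assumption, since the report does not name the coefficient it used, and the degrees of freedom for students (55 - 1 = 54) are inferred from the sample size rather than read from the table.

```python
def single_rater_reliability(ss_subjects, df_subjects, ss_error, df_error, n_raters):
    """Intraclass correlation for one rater from two-way ANOVA components.

    r = (MS_subjects - MS_error) / (MS_subjects + (k - 1) * MS_error),
    where k is the number of raters.
    """
    ms_subjects = ss_subjects / df_subjects
    ms_error = ss_error / df_error
    return (ms_subjects - ms_error) / (ms_subjects + (n_raters - 1) * ms_error)


# Sums of squares from Table 6; df for students assumed to be 55 - 1.
reliability = single_rater_reliability(
    ss_subjects=502.5, df_subjects=54,
    ss_error=75.4, df_error=108,
    n_raters=3,
)
print(f"single-rater reliability = {reliability:.2f}")
```

Under these assumptions the coefficient agrees closely with the reliability reported in the text.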

                         Table 6

          Sources of Variance in Performance Ratings

Source of Variance       df         SS         MS

Students (a)                      502.5        9.3
Judges (b)                         16.6        5.3
Error (a x b)           108        75.4         .7



Relationship between judges' ratings and scoring methods. Table 7
shows the Spearman rank-order coefficients between each of the
individual total performance scores (PROPS, PROPM, Total 1, and Total
2) and MRATE. In terms of its relationship with the other scoring
methods and MRATE, PROPS was clearly the least adequate total score.
This is not surprising, since this scoring method does not use
important information on the differential number of moves that are less
than the maximum allowed. The highest relationship between scores was
between Total 1 and Total 2 (r = .96); these scores undoubtedly are so
similar in this study because the test was not adaptive. Most students
attempted the same problems, so that the Total 2 adjustment for the
difficulty level of the problems attempted did not differentiate
between students. In an adaptive test where students converged on
problems of varying difficulty levels, performance as indexed by Total
1 and Total 2 would be expected to differ appreciably.



                              Table 7

      Spearman Rank-Order Correlations Between Individual
      Total Performance Scores and Mean Performance Ratings

Score        PROPS     PROPM     Total 1     Total 2

PROPS
PROPM         .71
Total 1      -.79      -.88
Total 2      -.74      -.81        .96
MRATE         .68       .87       -.85        -.89

Note. All Spearman coefficients significant at p < .001.



Although the correlation of these two scores was high, examination
of the students who were classified as the best performers by each
score showed that they did evaluate performance differently. The top
10 students on each score were essentially the same group, with the
exception of three students who had the top three Total 1 scores but
ranked as low as 16th on Total 2 scores. All three of these students
worked only on the easier problems and solved them all in close to the
optimal number of moves; as a result, their Total 1 scores were high.
However, many students who did well on the more difficult problems
received higher Total 2 scores as well, because such scores take into
account the difficulty level of problems attempted.

If the judges' ratings, which examined each protocol in a more
comprehensive way, were used as a criterion against which to evaluate
the different scoring methods, Total 2 was slightly but not
significantly better than the PROPM and Total 1 scores. The judges, in
describing how they made their ratings, were clearly taking into
account not only the number of moves beyond the optimal number (Total
1) but also the relative difficulty of the problems attempted by each
student; therefore, if students had worked on problems of more varied
difficulty levels, Total 2, which takes both these factors into
account, would seem to be even more superior to PROPM and Total 1.

Consistency of performance across problems. Important for the
usefulness of this problem type in assessing spatial problem-solving
ability is whether reliable individual differences on various
performance criteria can be identified across problem replications of
similar and varying difficulty levels. Table 8 shows the
intercorrelations of the total number of moves used by students (lower
triangle) and the intercorrelations of the number of illegal moves made
(upper triangle) across the 13 problem replications. The correlations
in the lower half of Table 8 fail to demonstrate strong consistency of
the Number of Moves performance measure across problems. That is,
there was not a consistent tendency for students to rank order
themselves similarly across problems on this performance score. Some
small clusters of statistically significant and moderate size
correlations existed between Problems 2 through 4, Problems 5 through
10, and to a lesser extent
Intercorrelations of Number of Illegal Moves (Upper Triangle) and
Total Number of Moves (Lower Triangle) for Problems

Problem - - _ • ■ 



S4 
-.04 

32 
.48 

32 
.16 

SB 
.41 

S3 
.21 

S3 
.20 



rbblem 




1 / 


2 


3 


4 


5 


6 










33 


S4 


S4 


54 


1 


r 




<^ .41 


.13 


,14 


,07 


-.23 




ir 


33 




33 


33 


32 


32' 


2 


r 


-.04 




.08 


,35 


.48 


.13 




if 


33 


33 




33 


32 


32 


3 


" If 


-L63 


i74 




- 15 


-i20 


-20 


4 




S4 


33 


33 




53 


53 


r 


-.05 


.41 


.60 




.24 


.06 




N 




32 


32 


S3 




54 


5 


r 


-.06 


02 


-.08 


.25 




.24 


6 




5i 


32 


32 


S3 


54 




r 


-.04 


-.09 


-.06 


-.04 


.21 






iir 


Si 


32 


32 


53 


53 


53 


7 


r 


.21 


.16 


-ill 


.13 


.34 


.32 


8 


N 


55 


3d 


35 


49 


4d 


4d 


r 


-.07 


-.12 


-.08 


.18 


.38 


.10 


9 


i? 


4^ 


28 


28 




45 


45 


r 


-.12 


-.22 


-.16 


-.05 


.10 


.35 


10 


N 


44 


28 


28 


43 


: 43 


43 


r 


.22 


-.05 


-.11 


.20 


.03 


.30 ' 


li 


ii^ 




2B 


28 


. 48 


., 48 


48 


r 


.02 


.18 


-.10 


.17 


.18 


=.10' 


12 


if 


4d 


28 


28 
between Problems 9, 10, 12, and 13. These moderate positive
correlations, which tend to be located near the diagonal, suggest that
although individual differences as indexed by total number of moves
were not very consistent for the particular set of problems used here,
consistency of performance was more likely to be obtained across
problems of more similar difficulty levels.

A probable reason for the lack of consistent performance across
problems is the small variation in performance for most of the problems
due to the overall easiness of the test. With the majority of students
solving many problems in the minimal or close to minimal number of
moves, the low variability of the performance scores across problems
would greatly decrease correlations.

Similarly, there was not a strong tendency for the same students
to make more illegal moves across problems, as indicated in the upper
half of Table 8. However, many more moderate and statistically
significant correlations existed than would be expected by chance. It
was originally expected that the number of illegal moves might relate
to difficulty in understanding the instructions and problem task. The
small number of illegal moves made by students on most problems (see
Table 2), however, not only decreased the likelihood of large
correlations across problems but also suggested that the moderate
correlations that did appear were due more to carelessness on the part
of some students in entering their responses on the CRT.

From Table 2 it was also seen that there were very few repeated
moves made by students, indicative of backing up in the problem
solution. Not surprisingly, then, no strong consistency across problems
was found for this performance index (see correlation matrix in
Appendix Table ...).

To examine the relationship between the number of legal moves
used, the number of illegal moves, and the number of repeated moves
within a single problem and across problems, the intercorrelation
matrices between these performance indices were computed (see Appendix
Tables F-3 and F-4). If all three indices were related to ability
to solve these problems, they should be related to each other within
and across problems. Examination of the intercorrelation matrices
demonstrated that the number of total, illegal, or repeated moves on
the same or on a different problem were not highly correlated, with the
exception that within the same problem the number of repeated moves
correlated moderately highly (average r = .?0) with the number of total
moves (see Appendix Table F-3). This latter relationship is not
surprising, since it is a part-whole correlation, with the number of
repeated moves being included in the total number of moves.

Another way to examine the consistency of performance is to relate
performance on individual problems with performance on the test as a
whole, as indexed by various total scores. These "item-total"
correlations, shown in Table 9, can assist in selecting the problems
that are most discriminating. In Table 9 the five or six highest
correlations in each row are underlined. These data indicate that
generally problems in the middle range of difficulty were most
discriminating. Since correlations between individual problem scores
and the four alternative total scores are to varying degrees part-whole
correlations, the last two rows of Table 9 show the correlations
between a problem score and the total score on the remaining problems,
using the two total scores discussed earlier as being the most
promising (Total 1, Total 2). Considering that the problem-excluded
total scores consist of only 12 "items" and that the easiest and most
difficult problems were not very discriminating, some of the
correlations are encouraging. The data suggest that if several problems
can be tailored to the same difficulty level (see discussion of Table 3
above), one appropriate for each individual student, improved
reliability may be obtained.
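The distinction drawn above between an item-total correlation and a correlation with the problem-excluded total can be sketched as follows. This is only an illustration: the function names and the demonstration scores are hypothetical, and the report does not say how its correlations were actually computed beyond calling them product-moment correlations.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total_correlations(scores):
    """scores[s][p] is the score of student s on problem p.

    Returns two lists with one entry per problem:
    - item_total: problem score vs. total over all problems (part-whole)
    - item_rest:  problem score vs. total over the remaining problems
    """
    n_problems = len(scores[0])
    totals = [sum(row) for row in scores]
    item_total, item_rest = [], []
    for p in range(n_problems):
        item = [row[p] for row in scores]
        rest = [t - i for t, i in zip(totals, item)]  # problem-excluded total
        item_total.append(pearson_r(item, totals))
        item_rest.append(pearson_r(item, rest))
    return item_total, item_rest


# Hypothetical scores for 5 students on 3 problems (not data from the study).
demo_scores = [[1, 2, 1], [2, 2, 3], [3, 1, 2], [4, 3, 4], [5, 4, 4]]
item_total, item_rest = item_total_correlations(demo_scores)
```

In this made-up example the part-whole correlation for the first problem is larger than its problem-excluded counterpart, which is the inflation the last two rows of Table 9 are meant to remove.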

Table 9

Correlations of Individual Problem Scores (Score 1, Score 2) and
Several Total Test Scores, by Problem

If |r| > .36, p < .01; if |r| > .27, p < .05.
Since Score 1 and Score 2 were linear transformations of each other,
correlations with total scores were identical.



Response Latencies

Consistency of latency measures across problems. Table 10 shows the
intercorrelations of initial response latencies (lower triangle) and
average response latencies (upper triangle) across all 13 problems.
The initial latency correlations showed a moderate to strong tendency
for individuals to be consistent in the amount of time they spent in
initial study of a problem prior to their first move. There was an
even stronger tendency for the average time per move to be consistent
across problems, with most of the correlations in the .20 to .50 range
and many in a higher range.

Table 11 shows the intercorrelations of the total time spent on
each problem. These data indicate a moderate relationship across most
problems.

Thus, there seemed to be a substantial degree of consistency in
the initial, average, and total time taken by individuals in working on
these problems. The response latency measures may tap differences in
the cognitive style of reflectivity versus impulsiveness (Kagan, 1955;
Kagan et al., 1964) or the degree of planning by the student. Since
all three correlation matrices (initial, average, and total latencies)
showed a slight tendency for the correlations to be largest near the
diagonal, the work strategy or style of each student may vary somewhat
at different points in the test, being more consistent for problems
that are worked on closer to each other in time.
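One way to check the "largest near the diagonal" observation is to average the correlations at each distance from the diagonal. The sketch below assumes the rows and columns of the matrix are in order of administration; the function name and the 4 x 4 matrix are hypothetical and are not values from Tables 10 or 11.

```python
def mean_r_by_lag(corr):
    """Average the off-diagonal correlations at each distance from the diagonal.

    corr is a square, symmetric correlation matrix (list of lists) whose rows
    and columns are assumed to be in order of administration.
    Returns {lag: mean correlation for problem pairs that many positions apart}.
    """
    n = len(corr)
    result = {}
    for lag in range(1, n):
        values = [corr[i][i + lag] for i in range(n - lag)]
        result[lag] = sum(values) / len(values)
    return result


# Hypothetical 4-problem correlation matrix (not data from the study).
demo_corr = [
    [1.0, 0.6, 0.4, 0.3],
    [0.6, 1.0, 0.5, 0.4],
    [0.4, 0.5, 1.0, 0.6],
    [0.3, 0.4, 0.6, 1.0],
]
by_lag = mean_r_by_lag(demo_corr)
```

If the means fall steadily as the lag grows, as they do in this made-up matrix, the pattern is consistent with work style drifting over the course of the test.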


The response latency measures may also reflect individual
differences in the speed of spatial information processing, which, in
this case, represents the efficiency with which a sequence of moves can
be traced out visually and maintained in memory. Such differences may
or may not show up in the performance measures, since students may
compensate for slower information processing speed with more care and
slower response latencies.

Latency trends. Figure 2 shows plots of response latencies in
seconds (vertical axis) versus the numbered moves (horizontal axis) for
sampled problems for two students who performed very well on the test
as a whole and for two students who performed poorly, based on judges'
performance ratings (MRATE). In each graph an * indicates where the
plot would have ended had the problem been solved in the optimal number
of moves. Graphs which continue beyond the 27th move at the right end
of the horizontal axis were not solved by the student.

The graphs shown here suggest that good problem solvers (Students
A and B) had longer initial study times for Move 1. Although this
seemed to be the case for some of the good problem solvers, typical
initial study times for other good problem solvers indicated that this
was not a consistent trend. Most of the latency graphs examined did
seem to be characterized as follows:

1. Generally, initial latencies were longer than the latencies
   for subsequent moves.

2. "Spikes" in the graphs frequently occurred every several
   moves, indicating that the student was restudying the problem
   and/or evaluating his or her progress. Although not analyzed
   systematically, some students' graphs (e.g., Student B in
   Figure 2) seem to be characterized by higher spikes than
   others.

3. For problems that were solved, latencies typically dropped to
   1 to 4 seconds for the last 3 to 6 moves, indicating that the
   solution path had been discovered. This finding may be
   consistent with short-term memory capacity research, which
   indicates that somewhat fewer than seven "chunks" of information
   can be maintained in short-term memory while other cognitive
   resources are being allocated simultaneously (Kintsch, 1977,
   p. 199).

4. Poorly solved problems often showed a conspicuous absence of
   spikes or restudy points. In Figure 2, Problems 10 and 12 for
   Student C and Problem 8 for Student D exemplified this point.
   On the other hand, there were problems solved poorly which did
   contain spikes or restudy points (e.g., Problem 13 for Student
   C), indicating that the student was trying to get back on the
   right track.

Figure 2. Response latencies in seconds by move number for sampled
problems. Panels: Student A, Good Performance; Student B, Good
Performance; Student C, Poor Performance; Student D, Poor Performance
(performance level based on judges' performance ratings, MRATE).
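The four graph characteristics listed above can be turned into simple numeric features of a single latency record. The sketch below is illustrative only: the report describes these trends from visual inspection, so the function name, the spike rule (a latency at least twice the median of the later moves), and the 4-second cutoff for "quick" closing moves are all assumptions.

```python
from statistics import median


def latency_features(latencies, spike_ratio=2.0, fast_threshold=4.0):
    """Summarize one problem's latency record (seconds before each move).

    spike_ratio and fast_threshold are illustrative assumptions, not
    values taken from the report.
    """
    initial = latencies[0]          # study time before Move 1
    later = latencies[1:]           # latencies for Move 2 onward
    base = median(later) if later else 0.0
    # A "spike" is a later move whose latency is well above the typical one.
    # later[i] belongs to move number i + 2.
    spikes = [i + 2 for i, t in enumerate(later)
              if base > 0 and t >= spike_ratio * base]
    # Length of the closing run of quick moves.
    final_run = 0
    for t in reversed(latencies):
        if t <= fast_threshold:
            final_run += 1
        else:
            break
    return {"initial": initial, "spike_moves": spikes, "final_fast_run": final_run}


# Hypothetical latency record for an 11-move solution (not data from the study).
demo_latencies = [22.0, 5.0, 4.5, 15.0, 5.0, 6.0, 14.0, 3.0, 2.0, 2.5, 1.5]
features = latency_features(demo_latencies)
```

For this made-up record the features show a long initial study time, restudy spikes at Moves 4 and 7, and a closing run of four quick moves; a record with an empty spike list would correspond to the "absence of restudy points" pattern in item 4.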

Overall, some trends were suggestive, but they were by no means
universal. Although perhaps providing clues to the work styles of some
students (e.g., impulsive responding with few, if any, study points),
the latency trends appeared to be too idiosyncratic to be very useful
from a psychometric point of view.

Relationship between Performance and Response Latencies

The correlations between the number of moves students used and the
initial and average move latencies for each problem indicated no
relationship between these latency measures and performance within a
single problem. Similarly, when initial and mean latencies for each
problem were correlated with total scores (Total 2) and MRATE, no
significant correlations were found (see the tables in Appendix F).

Not surprisingly, problems that were not solved well took longer
than problems solved well, as indicated by the first row of Table 12,
which shows the correlation of total time spent on each problem with
the number of moves needed (and, hence, the individual problem scores
Score 1 and Score 2). This relationship held for all problems except
Problems 1, 3, and 12; comparison with the difficulty index in Table 2
shows that the relationship was strongest for problems of middle
difficulty levels (Problems 4, 5, 8, and 11). When total problem time
for each problem was correlated with students' total test performance,
as indexed by Total 2 and MRATE, these same problems related most
highly.

Table 12

Product-Moment Correlations of Total Time Spent on Each Problem

*Statistically different from zero at p < .05.
**Statistically different from zero at p < .01.

These data indicate that, with the exception of the total latency,
or time spent on a problem, the response latencies did not show any
consistent meaningful relationship to performance.

Motivational and Biographical Data

Table 13 shows the frequency and percentage of students endorsing
various response alternatives to questions about prior experience,
perceived difficulty, motivation level, and self-evaluation. Regarding
prior experience with this problem type, Question 1 indicates that 40%
of the students had never worked on this problem type, 58% had done so
a few times, and only 2% had worked such problems many times.

Describing how students are to solve these problems and enter
their moves in a sequence of computerized instructions has certain
difficulties, but the responses to Question 2 indicate that nearly all
students had little or no difficulty understanding the instructions.

Most students thought half or more than half of the problems were
rather easy (Question 3), were not at all or only slightly nervous
(Question 6), and either enjoyed working on the problems or were
neutral about it (Question 8). Responses to Question 4 suggest that
the instructional sequence and experimental conditions did not succeed
in motivating most students to try hard to solve all of the puzzles in
as few moves as possible. This less than optimal motivation under
conditions where the test has no particular importance to the student
is probably more of a problem for tests of this type than for more
traditional psychometric measures, since each item or problem requires
more perseverance.

It is difficult to say how much the scores in this study were
affected by some students being less concerned about optimal
performance. However, to examine this question with the data
available, the mean total score (Total 2) and MRATE of students
responding to Question 4 with "a" (mean Total 2 = .96, mean MRATE =
5.59), "b" (mean Total 2 = 1.02, mean MRATE = 4.93), and "c" (mean
Total 2 = 1.03, mean MRATE = 4.99) were compared and no significant
differences found.

Question 5 indicates that about half of the students thought the
length of the test affected their motivation. Finally, 56% of the
students thought they did fairly well on the test, 30% thought they did
not do very well, and 10% had no idea how well they had done (Question
7). For future research with this type of test, it would be of
interest to have the computer ask some of these questions during actual
testing so that students' motivation, anxiety, difficulty perception,
and confidence could be related to the simultaneous quality of their
solutions.

It is important to know to what extent a test measures prior
experience with the assigned tasks. Differences in test performance due
to prior experience may be desirable or undesirable depending on the
test application. In this study, a general measure of spatial reasoning
ability was sought so that performance scores would not be
significantly determined by prior experience with any specific spatial
task. A comparison of the mean Total 2 scores and performance ratings
(5.55 versus 4.73) for the 20 students who reported no prior experience
with this problem type and for the 30 students who reported having
worked such problems a few or many times (see Question 1) showed no
significant performance differences based on stated prior experience.
Similarly, a comparison of male and female mean Total 2 scores and
mean ratings also showed no statistically significant differences.

Table 13

Distributions of Responses to Puzzle Reaction Questions (N=50)

Question                                                          N     %

1. Before today, how often have you worked on this kind of puzzle?
   a. never                                                      20    40
   b. a few times                                                29    58
   c. many times                                                  1     2

2. How much difficulty did you have understanding the instructions
   before starting the puzzles?
   a. no difficulty                                              39    78
   b. a little difficulty                                        10    20
   c. much difficulty                                             1     2

3. Which of the following best describes how difficult you thought
   the puzzles were?
   a. All of the puzzles were easy                                3     6
   b. A few puzzles were difficult, the rest were rather easy    27    54
   c. About half the puzzles were easy and half were difficult   15    30
   d. A few puzzles were easy, the rest were rather difficult     5    10
   e. All of the puzzles were difficult                           0     0

4. Which of the following best describes your attitude towards
   completing the puzzles?
   a. I tried hard to solve all puzzles in as few moves as
      possible                                                   18  36.7
   b. I tried hard to solve most but not all of the puzzles in
      as few moves as possible                                   19  38.8
   c. I tried to solve the puzzles, but was not very concerned
      about using as few moves as possible                       12  24.5
   d. I didn't care whether I solved the puzzles or not           0     0

5. Did the length of the test affect your motivation?
   a. not at all                                                 19    38
   b. somewhat                                                   26    52
   c. quite a bit                                                 5    10

6. Were you nervous or uncomfortable while working on the puzzles?
   a. not at all                                                 33    66
   b. somewhat                                                   17    34
   c. very much so                                                0     0

7. How well do you think you did on the puzzles?
   a. very well                                                   2     4
   b. fairly well                                                28    56
   c. not very well                                              15    30
   d. I don't really know                                         5    10

8. How did you feel about working on the puzzles?
   a. I disliked it a lot                                         3     6
   b. I disliked it somewhat                                      4     8
   c. I felt neutral about it                                    11    22
   d. I enjoyed it somewhat                                      26    52
   e. I enjoyed it a lot                                          6    12

Perceived Difficulty Ratings

Dimensions of Perceived Difficulty

Table 14 shows the proportion of students reporting voluntary use
of various rating dimensions in their protocols while sorting the
stimuli and the proportion of students selecting each dimension from a
prepared list of dimensions provided by the experimenter after the
sorting was completed (see Appendix B for the rating booklet). The
last column in Table 14 shows the percentage distribution of
frequencies with which each of the dimensions in the prepared list was
used. Table 14 shows that all dimensions were reported less frequently
in the free response voluntary protocol situation than when the
prepared list was used. This would be expected, since some students
might not have thought to report a dimension they might recall using
when prompted later. However, the large discrepancy between these two
columns for Dimensions h (number of columns not matching) and i (number
of rows not matching) would suggest that these two dimensions were not
very salient, despite the high proportion of students endorsing these
dimensions post hoc. The number of students endorsing the supposedly
irrelevant Dimensions j and l under the prepared list conditions,
compared to the near absence of these dimensions in the volunteered
protocols, further suggests that something like social desirability
responding was occurring in the prepared list condition.



An examination of the percentage distribution data in the last
column indicates these less relevant dimensions were most often
reported as being used in Some or None of the problems. It seems likely
that if students endorsed prepared dimensions that had not actually
been used or that were not the most salient, they would endorse the
Some category rather than the All or Most categories. On the other
hand, the dimensions reported as being used most often in the voluntary
protocols were, with the exception of Dimension c, endorsed most
heavily in the All or Most categories in the prepared list. Thus, the
data from the voluntary protocols, in conjunction with the All and Most
categories in the frequency ratings, would seem to be the best
indicators of the most salient rating dimensions that students thought
they were using.
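The two comparisons made in these paragraphs, the gap between the prepared-list and voluntary percentages and the share of All or Most endorsements, can be tabulated directly from Table 14. The sketch below transcribes the percentages as they appear in that table for the dimensions that were on the prepared list; the function and field names are hypothetical, and the calculation is only an illustration of the reasoning, not an analysis reported by the authors.

```python
# (voluntary %, prepared-list %, All, Most, Some, None), as read from Table 14.
# Dimensions e, g, and k are omitted because they were not on the prepared list.
table14 = {
    "a": (93, 98, 35, 35, 28, 2),
    "b": (68, 100, 58, 28, 14, 0),
    "c": (58, 91, 26, 30, 35, 9),
    "d": (50, 93, 53, 26, 14, 7),
    "f": (43, 86, 14, 30, 42, 14),
    "h": (18, 72, 19, 23, 30, 28),
    "i": (11, 81, 21, 26, 35, 19),
    "j": (7, 63, 16, 19, 28, 37),
    "l": (0, 39, 2, 2, 35, 60),
}


def summarize(table):
    """Return one summary row per dimension, largest endorsement gap first."""
    rows = []
    for dim, (vol, prep, all_, most, some, none) in table.items():
        rows.append({
            "dim": dim,
            "gap": prep - vol,             # post hoc endorsement minus voluntary report
            "all_or_most": all_ + most,    # share reporting frequent use
            "some_or_none": some + none,   # share reporting infrequent or no use
        })
    return sorted(rows, key=lambda r: r["gap"], reverse=True)


summary = summarize(table14)
```

Sorted this way, Dimensions i, j, and h head the list, which matches the dimensions the text singles out as endorsed post hoc far more often than they were volunteered.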

From Table 14 it is clear that the most salient rating dimension
was Dimension a, the number of moves required to solve the puzzle
(i.e., the solution path length). Ninety-three percent of the students
voluntarily reported this dimension in some form in their protocols,
and virtually all students (98%) selected the dimension in the prepared
list condition. The other most salient dimensions were Dimension b,
whether the student could "see" the solution; Dimension c, the number
of squares not matching in the two patterns; Dimension d, the time the
student felt it would take to solve the puzzle; Dimension e, the type
or nature of moves required; and Dimension f, how far apart certain
numbers were. Any rank order of salience of these dimensions would be
difficult to justify, since they are not independent, and a student
reporting the number of squares not matching in his or her protocol
could have been taking the distance between squares into account as
well, without explicitly reporting this dimension.

Table 14

Dimensions Used in Rating Difficulty

                              Percentage of Students     Percentage of Time
                              Reporting Using the        Students Reported Using
                              Dimension on at Least      the Dimension in Rating
                              Some of the Problems       the Problems
                              Voluntary    Prepared
Dimension                     Protocols    List          All   Most   Some   None

a. The number or
   complication of moves         93          98           35    35     28      2
b. Whether can "see"
   solution                      68         100           58    28     14      0
c. Number of squares
   not matching                  58          91           26    30     35      9
d. Amount of time to
   solve                         50          93           53    26     14      7
e. Types of moves
   required                      50           -            -     -      -      -
f. How far apart certain
   numbers were                  43          86           14    30     42     14
g. How much thought
   required                      32           -            -     -      -      -
h. The number of columns
   not matching                  18          72           19    23     30     28
i. Number of rows
   not matching                  11          81           21    26     35     19
j. Location of empty
   space in left pattern          7          63           16    19     28     37
k. Similarity to already
   solved puzzle                  2           -            -     -      -      -
l. Whether one pattern
   was in numeric order
   from 1 to 15                   0          39            2     2     35     60

Note. Missing entries are for dimensions not included in the prepared
list but reported by some students in their voluntary protocols.

A further question can be raised as to whether some of these
reported dimensions are really rating dimensions underlying difficulty
judgments or are actually synonymous with difficulty itself. This
would seem to be the case with Dimensions b and d in Table 14. If
students had been asked to rate "whether they could readily see the
solution" or "how much time it would take to solve the puzzle," the
rating task might be equivalent to rating the difficulty, and such
physical problem characteristics of each puzzle as path length and the
distance between various numbers would probably underlie these
judgments as well. It would seem, then, that the dimensions most
important for students in evaluating the difficulty of these problems
were the solution path length, the number of squares not matching in
the two patterns, and the distance dimension of how far apart certain
squares were in the two patterns. Since no dimension was used for all
problems, it seems likely that the relative importance of each
dimension varied somewhat for each problem, depending on the particular
pattern configurations to be compared.

Individual Differences in Mean Perceived Difficulty

Table 15 summarizes the mean difficulty ratings for each of the
four 15-puzzle problem sets separately. These data show that there
were substantial individual differences in the level and variation of
difficulty perceptions, even for the same problems. For example, for
Stimulus Set 1, although the average student thought the problems were
Easy or Somewhat Easy, one student thought the average problem in the
set was Very Easy and another thought the average problem was Somewhat
Difficult. Individual differences in perceived difficulty of the
problems within stimulus sets were also evidenced, since about
two-thirds of the students utilized all six rating categories, but
about one-third utilized only the four easiest categories, and one
student rated all stimuli with the two easiest categories. Without
data for the same students on an independent rating task irrelevant to
the difficulties rated here, it is not possible to determine to what
extent these individual differences reflect response biases in the use
of category rating scales; but it seems reasonable to assume that the
differences found do indicate some true perceptual differences in
perceived difficulty. Presumably, these differences reflect individual
differences in the ability to visualize and to maintain a sequence of
moves in short-term memory.

Table 15

Stimulus                      Individual Mean Ratings
Set        Lowest             Mean                      Highest

1       1.12 Very Easy     2.57 Easy/Somewhat Easy   3.77 Somewhat Difficult
2       1.94 Easy          3.13 Somewhat Easy        3.82 Somewhat Difficult
3       1.69 Easy          2.63 Somewhat Easy        3.44 Somewhat Easy/
                                                          Somewhat Difficult
4       1.76 Easy          2.86 Somewhat Easy        4.12 Somewhat Difficult



Perceived Difficulty and Number of Moves

That the obtained individual differences in perception seem to be
reliable is suggested by the data in Figure 3, which shows the
perceived difficulty ratings of four students within Problems 9 and 10
as the distance in moves from the start puzzle configuration approached
the goal puzzle configuration. These graphs were obtained by having
students rate the difficulty of reaching the goal configuration not
only from the start configuration, but from various intermediary
configurations between the start and goal configuration. Thus, for
example, in Figure 3a it might be presumed that if Student 9 were
actually attempting to solve Problem 9, the puzzle would look Somewhat
Difficult to him or her until he or she was about 7 moves away from the
goal; then difficulty would drop off rapidly until he or she was about
5 moves from the goal, when the puzzle would appear to be Very Easy.



Note that across both of the problems shown in Figure 3, the four
students show marked consistency in how they perceived the difficulty
of different puzzle distances. For example, one student perceived both
problems as easier than the mean student at all distances from the
goal, whereas Student 6 perceived both problems as more difficult than
the mean student did at all distances from the goal. Even though only
a few examples of students and problems are shown in Figure 3, this
tendency for reliable individual differences in difficulty perceptions
was present in nearly all combinations of students and problems
examined. These data suggest that if the differences in difficulty
perceptions relate to performance, then reliable individual performance
differences in solving these problems should be obtainable.

Relationship of Difficulty Perception and Path Length

Since path length seemed to be a dominant dimension in the student
protocols, difficulty perception scale values were correlated and
plotted against path length for all 67 puzzles. Figure 4 shows the
scatterplot relating solution path length of each puzzle to its mean
scale value. Although the correlation between the two variables was
substantial, the relationship between the two variables was not
strictly linear at the right, or high, end of the plot. Although end
effects must always be considered in category rating scales, the fact
that students could have assigned higher ratings at the high end of the
curve would suggest that the flattening of the curve for long path
lengths represents a real effect. Students apparently could not
discriminate differential path lengths greater than about 16. Perhaps
a secondary rating dimension, such as the distance between numbers in
the patterns or the number of squares not matching in the two patterns,
is important in differentiating problems with longer path lengths.

Figure 4

Bivariate Distribution of Perceived Difficulty Mean Scale Values
and Path Length for 67 Puzzles



Figure 4 also provides estimates of how difficult puzzles with
different path lengths will appear to the average student when
beginning work on a problem. A puzzle perceived to be Very Easy would
correspond to a value on the vertical axis in Figure 4 between .5 and
1.5; Easy puzzles would range from 1.5 to 2.5; Somewhat Easy, from 2.5
to 3.5; Somewhat Difficult, from 3.5 to 4.5; Difficult, from 4.5 to
5.5; and Very Difficult, from 5.5 to 6.5. Solution path lengths
corresponding to the difficulty categories overlapped somewhat, with
Very Easy ratings corresponding to puzzles requiring as few as 1 move,
Easy for puzzles of 4 to 10 moves, Somewhat Easy for puzzles ranging
from 5 to 16 moves, Somewhat Difficult for puzzles requiring 8 to 18
moves, and Difficult for puzzles with from 15 to 26 moves. None of the
puzzles used here, which ranged from 1 to 26 moves, were rated Very
Difficult by the average student.

Figure 5

Bivariate Distribution of Standard Deviations of Perceived
Difficulty Ratings and Path Length for 67 Puzzles

Figure 5 shows a plot of solution path lengths versus the standard
deviations of the students' category ratings. These data demonstrate
that although students tended to agree more in their difficulty
perceptions for stimuli with short or very long solution paths, there
was substantial disagreement in perceived difficulty for puzzles with
path lengths in the middle range, with a peak disagreement for solution
paths of about 10 moves.
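The category boundaries given for the vertical axis of Figure 4 amount to rounding a mean scale value to the nearest of six categories. A minimal sketch of that mapping follows; the function name is hypothetical, and assigning a value that falls exactly on a boundary (such as 1.5) to the higher category is an assumption, since the text gives each boundary as the end of one range and the start of the next.

```python
CATEGORIES = [
    "Very Easy",
    "Easy",
    "Somewhat Easy",
    "Somewhat Difficult",
    "Difficult",
    "Very Difficult",
]


def category_for(mean_value):
    """Map a mean scale value (0.5 to 6.5) to its difficulty category.

    Category k (1 to 6) covers the half-open range [k - 0.5, k + 0.5);
    the top value 6.5 is folded into Very Difficult.
    """
    if not 0.5 <= mean_value <= 6.5:
        raise ValueError("scale value outside the 0.5-6.5 range")
    index = min(int(mean_value + 0.5), 6) - 1
    return CATEGORIES[index]


# Lowest and highest individual means for Stimulus Set 1 in Table 15.
lowest_label = category_for(1.12)
highest_label = category_for(3.77)
```

Applied to the Stimulus Set 1 extremes in Table 15, the mapping gives Very Easy for 1.12 and Somewhat Difficult for 3.77, in agreement with the labels printed in the table.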



DISCUSSION AND CONCLUSIONS

Problem Characteristics




The data suggested that four performance indices might be useful
in indexing problem difficulty: (1) the mean number of moves in the
sample, (2) the proportion of students solving a problem, (3) the
proportion of students solving a problem in the optimal number of
moves, and (4) the Special Difficulty Index. These four indices showed
substantial agreement in rank ordering the difficulty of the problems.
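The four indices can be computed for a single problem as in the sketch below. The Special Difficulty Index follows the definition given in the report's abstract (sample mean number of moves divided by the minimum number of moves); the function name, the treatment of unsolved attempts (counted at the number of moves actually used), and the demonstration data are assumptions made for illustration.

```python
def difficulty_indices(moves, solved, optimal_moves):
    """Compute four difficulty indices for one problem.

    moves[s]       number of moves used by student s
    solved[s]      True if student s solved the problem
    optimal_moves  minimum number of moves needed to solve the problem
    """
    n = len(moves)
    mean_moves = sum(moves) / n
    p_solved = sum(1 for ok in solved if ok) / n
    p_optimal = sum(
        1 for m, ok in zip(moves, solved) if ok and m == optimal_moves
    ) / n
    return {
        "mean_moves": mean_moves,
        "p_solved": p_solved,
        "p_optimal": p_optimal,
        # Sample mean moves relative to the shortest possible solution.
        "special_difficulty_index": mean_moves / optimal_moves,
    }


# Hypothetical data: five students on a problem with an 8-move optimum.
indices = difficulty_indices(
    moves=[8, 8, 10, 14, 27],
    solved=[True, True, True, True, False],
    optimal_moves=8,
)
```

Because the last index divides by the optimal path length, a value near 1 means the sample stayed close to the shortest solution regardless of how long that solution is.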

Because it adjusts for differences in solution path length while
also taking into account the average number of moves required by the
sample, the Special Difficulty Index not only appeared to be the best
index of problem difficulty but also correlated lower with the solution
path length of each problem than the other performance indices used to
estimate problem difficulty. This is a desirable situation, since
longer puzzles were not always the most difficult. Future research
with this problem type should consider use of some short, but less
direct or obvious, problems.

The numbers of illegal and repeated moves were found to be too low
and not consistent enough for individuals across problems to be useful
performance indices, at least for this problem set and sample.

Examination of problem performance indices, the Special Difficulty
Index, and students' judgments of the difficulty of the test problems
indicated that, with the exception of Problems 9, 10, 12, and 13, the
problems were too easy for most students. For example, except for
these four problems, 70% or more of the students solved each of the
remaining problems in the minimum number of moves. It seems likely
that these highly skewed distributions of number of moves to completion
precluded high correlations of individual performance indices across
problems, since small absolute differences in scores across problems
would be accentuated. Thus, the consistency across problems of the
number of moves to completion was generally poor, with indications of
only small to moderate consistency for clusters of problems of similar
difficulty. It is possible that if a more difficult set of problems
that were more similar in difficulty levels were administered, better
measures of consistency of performance would be obtained. The
item-total score correlations obtained for each problem suggested that
it would be possible to obtain a more discriminating subset of
problems. Because this was an exploratory study, however, no
preselection of problems was possible. Since the data suggest that
better consistency may be obtained among problems of similar difficulty
levels, an adaptive test, which tailors problems to the ability level
of each student, should increase the reliability of measurement.

Scoring Methods

Four alternative methods of scoring total test performance and two
methods of scoring individual problem performance were studied. The
scores that took into account differential numbers of moves (Total 1,
Total 2) between the optimal and maximum number allowed appeared to be
the best on intuitive grounds and were also related somewhat more to
judges' performance ratings. The Total 2 score, which also took into
account the difficulty of the problems the student attempted, appeared
to be the most meaningful score. Where other methods rank ordered
students differently, the rank order provided by Total 2 was most
highly related to judges' performance ratings. Although Total 2 may
appear to be additive in that it averages individual problem scores
(Score 2), the pattern or configuration of individual problem
performance is taken into consideration, since the individual problem
scores (Score 2) are adjusted for the difficulty of each problem, as
reflected in the mean performance of the sample on the problem. As a
result, students are penalized more for poor performance on easier
problems, relative to the group, than they are on more difficult
problems. In this way students who solve the same number of problems
but have different patterns of performance will obtain different Total
2 scores.
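The effect described here, a larger penalty for a slip on an easy problem than for the same slip on a hard one, can be illustrated with a simple group-relative score. The formulas for Score 2 and Total 2 are not reproduced in this section, so the ratio used below (a student's moves divided by the sample mean moves on that problem, averaged over attempted problems, with larger values meaning worse performance) is purely an illustrative assumption, as are the function name and the data.

```python
def relative_scores(student_moves, sample_mean_moves):
    """Illustrative difficulty-adjusted scoring (an assumed stand-in, not Score 2).

    student_moves      {problem id: moves used by the student}; problems not
                       attempted are simply left out
    sample_mean_moves  {problem id: mean moves used by the whole sample}
    Returns (per-problem ratios, mean ratio over attempted problems).
    """
    per_problem = {
        p: student_moves[p] / sample_mean_moves[p] for p in student_moves
    }
    total = sum(per_problem.values()) / len(per_problem)
    return per_problem, total


# Hypothetical sample means for one easy and one hard problem.
sample_means = {"easy": 8.0, "hard": 20.0}

# Both students use 10 moves more than the sample mean, on different problems.
slip_on_easy, total_easy_slip = relative_scores({"easy": 18, "hard": 20}, sample_means)
slip_on_hard, total_hard_slip = relative_scores({"easy": 8, "hard": 30}, sample_means)
```

Under this assumed ratio, the same ten extra moves raise the total more when they occur on the easy problem, so two students with the same number of solved problems can end up with different totals.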

Future research with this problem type will require study of the
validity of the various performance scores against relevant external
criteria. Since no such reliable criteria were available in this
study, the meaningfulness of the scores was tentatively determined by
comparing these objective scores with judges' performance ratings of
test performance. Strong indications of concurrent validity were
found. Those cases in which the objective scores rank ordered students
differently than the ratings indicated that whereas the objective score
(Total 2) penalized students more than judges' ratings for poor
performance on easier problems, the judges penalized students more for
not attempting some problems (although this was not always the
student's fault) and for doing poorly on more difficult problems.
Although it is difficult to determine which measure is more valid
without an external criterion, the high correlations between the
objective scores and the judges' ratings suggest some validity in both
types of data.

Latencies 

Mean initial and total latencies for each problem were strongly
related to some of the performance indices of problem difficulty. That
is, the group as a whole utilized longer initial study times and longer
total work times on more difficult problems. Similarly, problems that
took longer to solve were initially studied longer. The average
latency of moves within a problem did not relate to problem difficulty.



At the level of individual performance, only total latency, or
problem solution time, was related to problem performance. Some good
problem solvers were characterized by very long initial latencies, but
this tendency was not universal. Many good problem solvers did not
initially study the problem longer than did the average poor problem
solver. The average problem response latency measure did not relate to
individual student performances.

Plots of latency trends across problems were interesting from a
descriptive point of view in indicating that most students' trends
showed longer initial latencies followed by a few quicker moves,
occasional spikes indicating re-evaluation of progress, and finally
several very quick final moves indicating that the sequence of moves to
solution had been detected. However, no universal trends in response
latencies seemed to characterize good problem solvers versus poor
problem solvers well enough to be useful in scoring or predicting
individual performance. Latencies in this study seemed to confound
differences in the ability to visualize a sequence of moves and
differences in students' work styles. Strong evidence for such work
styles was found in the consistency of initial, average, and total
response latency measures across all problems. Students who took
longer initial study times, longer average times between moves, and
longer total work times on one problem showed a consistent tendency to
do so on other problems as well.

Thus, while the response latency measures were predictive of problem
difficulty and indicated the existence of consistent styles of
problem-solving behavior, they did not appear to be useful in scoring
individual performance.
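
The three latency measures discussed in this section can be sketched as follows. This is only an illustration under stated assumptions: the function name, the input format (cumulative seconds from problem presentation to each move entry), and the choice to average only the gaps between successive moves are hypothetical and are not taken from the study's software.

```python
from statistics import mean


def latency_measures(move_times):
    """Summarize response latencies for one problem.

    move_times: cumulative seconds from problem presentation to each
    entered move, in ascending order (an assumed input format).
    """
    if not move_times:
        raise ValueError("at least one move is required")
    initial = move_times[0]          # study time before the first move
    total = move_times[-1]           # total work time on the problem
    # Average time between successive moves (assumption: the initial
    # latency is excluded from this average).
    gaps = [later - earlier for earlier, later in zip(move_times, move_times[1:])]
    average = mean(gaps) if gaps else 0.0
    return {"initial": initial, "average": average, "total": total}


# Hypothetical record: a long initial study period, a few quick moves,
# one pause for re-evaluation, and a quick final move.
example = latency_measures([12.0, 14.0, 15.0, 21.0, 22.0])
```

Computing the three values per student and per problem would allow the consistency of work styles across problems to be examined with ordinary correlations.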

Motivational and Biographical Correlates of Performance

Although the posttest reaction questionnaire indicated that only
40% of the students had never worked problems of this type before, mean
performance scores between these students and those who had previously
worked such problems were not significantly different.

Only 36.7% of the students reported trying hard to solve all the
problems in the minimum number of moves. Slightly more students said
they tried hard to solve most, but not all, of the problems. Although
mean performance differences between subgroups reporting different
levels of motivation were not significant, these data plus the fact
that 52% of the students felt their motivation was affected by the
length of the test indicate that total testing times may need to be
shorter for this type of task than for tests with more conventional
item formats.

No sex differences in performance were found on this test. That
males typically show better spatial ability (Garai & Scheinfeld, 1968;
Maccoby & Jacklin, 1974) and restructuring ability (Maccoby, 1966;
Sweeney, 1953; Terman & Tyler, 1954; Tyler, 1965) would seem to predict
male superiority on this test. On the other hand, females have
generally been found to be less impulsive ([illegible]; Terman & Tyler,
1954; Tyler, 1965) [illegible] (Garai & Scheinfeld, 1968). The failure
to obtain sex differences with this type of task will only be of
concern once more reliable measurement is achieved. At that time,
hypothesized correlates of these problems should be examined to
determine whether scores index spatial reasoning, restructuring,
impulsivity, or some other psychological variable.

Dimensions of Perceived Difficulty

The most salient dimensions of perceived difficulty were the
number of moves required to solve the puzzle, the number of squares not
matching in the two patterns, and the distance dimension of how far
apart certain squares were in the two patterns. Since no dimension was
reported as having been used for all problems, it seems likely that the
relative importance of each dimension varied somewhat for each problem,
depending on the particular pattern configurations.
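
As an illustration of how such physical dimensions might be computed from a pair of patterns, the sketch below counts mismatched squares, rows, and columns and sums the distances tiles must travel. The function names, the use of 0 for the empty space, and the decision to sum distances over displaced tiles only are assumptions made for this example; the report does not spell out its exact formulas.

```python
import math


def positions(grid):
    """Map each tile number to its (row, column); 0 marks the empty space."""
    return {tile: (r, c) for r, row in enumerate(grid) for c, tile in enumerate(row)}


def physical_characteristics(start, goal):
    """Compare a start and goal pattern on several physical dimensions."""
    s, g = positions(start), positions(goal)
    # Tiles (excluding the empty space) whose location differs.
    moved = [t for t in s if t != 0 and s[t] != g[t]]
    n_rows, n_cols = len(start), len(start[0])
    rows_off = sum(start[r] != goal[r] for r in range(n_rows))
    cols_off = sum(
        [row[c] for row in start] != [row[c] for row in goal] for c in range(n_cols)
    )
    city_block = sum(abs(s[t][0] - g[t][0]) + abs(s[t][1] - g[t][1]) for t in moved)
    euclidean = sum(math.hypot(s[t][0] - g[t][0], s[t][1] - g[t][1]) for t in moved)
    return {
        "squares_not_matching": len(moved),
        "rows_not_matching": rows_off,
        "columns_not_matching": cols_off,
        "city_block_distance": city_block,
        "euclidean_distance": euclidean,
    }


# Example patterns in the style of the instruction screens (Appendix B).
start = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 0, 11], [12, 13, 14, 15]]
goal = [[1, 2, 3, 0], [5, 6, 7, 4], [9, 10, 11, 8], [12, 13, 14, 15]]
summary = physical_characteristics(start, goal)
```

For these two patterns three tiles (4, 8, and 11) are displaced by one square each, so both distance totals equal 3.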

When the actual values of these dimensions were computed for the
problems used in the computer-administered test (see Appendix Table C),
a hypothesized rank ordering of problem difficulty was obtained.
These three rank orders were quite similar (.51 < ρ < .79) but were not
as consistent as the rank orderings for difficulty obtained from
performance indices such as mean number of moves, proportion of students
solving the problem, and the Special Difficulty Index (see Table 3).
Thus, although these physical dimensions may be useful as a tentative
index of problem difficulty for use in initial problem selection prior
to data collection, the performance measures should provide more
precise indices of difficulty once normative data can be obtained.
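
A comparison of two difficulty orderings of this kind can be sketched with a rank-order correlation. The sketch assumes the coefficient reported for the rank orders is a Spearman correlation (the source does not name it), and the data values below are invented purely for illustration.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values for five problems (not data from the study).
solution_path_length = [3, 5, 8, 12, 6]
mean_moves_observed = [3.4, 7.1, 9.0, 20.5, 6.2]
rho = spearman(solution_path_length, mean_moves_observed)
```

With these invented values only two problems exchange adjacent ranks, giving a coefficient of .90.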

The actual perceived difficulty ratings showed substantial
individual differences in the level and variability of difficulty
perceptions, even for the same set of problems. Although possible
individual biases in the use of category rating scales cannot be
discounted, the data suggest that the individual differences found were
differences in subjective difficulties relating to individual
differences in ability to visualize and to maintain a sequence of moves
in short-term memory. Examination of individual difficulty perceptions
across problems indicated that these differences were reliable. These
data suggest that if the reliable differences in difficulty perceptions
do in fact relate to differential ability to visualize successful move
sequences, then an adequate selection of problem replications should be
able to tap these differences, resulting in reliable performance
differences.

Comparison of the easy problems with the problems that challenged
students more in the computer-administered test suggested that too many
of the problems could be solved in a reactive manner, that is, by
responding to the immediate stimulus pattern without trying to visualize
or to plan several moves ahead. Such problems would not tap differences
in students' ability to visualize a sequence of moves because
students would not find themselves in a difficult situation by not
planning ahead. The more challenging problems (e.g., Problems
[illegible], 10, 12, and [illegible]) were those in which a student
could get in trouble by not visualizing several moves in advance (see
Appendix C). This implies that future studies should include more
problems that prevent reactive solutions, i.e., require more planning
ahead.

Comparison of the mean perceived difficulty of the problems
included in the computer-administered test indicated less agreement with
actual problem difficulty than might be expected from other studies.
This appeared to be due to the inability of students to differentiate
the relative difficulties of problems with longer solution paths.
Thus, to the extent that increased motivation under adaptive testing
depends on correct student perceptions of problem difficulty (Prestwood
& Weiss, 1977), adaptive administration of this problem type may not
have a motivational advantage. On the other hand, reduced frustration
would seem likely to result under adaptive conditions from not
requiring students to work on problems much more difficult than their
ability levels, even if they cannot accurately perceive the actual
difficulty of the problem beforehand.

The perceived difficulty scale values related highly to
the mean initial response latency measure for the computer-administered
problems. This supports the idea that the students spend time before
their first move trying to visualize a sequence of moves, since path
length appeared to be a primary rating dimension in determining
perceived difficulty.

The results from this pilot study suggest certain improvements in
problem selection and design. Future tests of this type should consist
of fewer but more difficult problems, particularly those which do not
permit reactive, impulsive solutions. If differences in the
ability to construct an optimal sequence of moves are to be tapped,
then more problems must be designed that force the student to plan
ahead. More complex problems should overload the memories of students
and should induce differences in strategies in manipulating the number
patterns.
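
When candidate problems are being screened, the minimum number of moves for a start and goal pair can be found by exhaustive search. The following is a minimal sketch, not the procedure used in the study: the breadth-first search, the depth limit, and the function names are assumptions made here, and the index function follows the report's description of the Special Difficulty Index as the sample mean number of moves divided by the minimum number of moves.

```python
from collections import deque
from statistics import mean


def neighbors(state, size=4):
    """Yield every pattern reachable by sliding one tile into the empty space (0)."""
    blank = state.index(0)
    r, c = divmod(blank, size)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < size and 0 <= nc < size:
            j = nr * size + nc
            nxt = list(state)
            nxt[blank], nxt[j] = nxt[j], nxt[blank]
            yield tuple(nxt)


def min_moves(start, goal, limit=12):
    """Breadth-first search for the shortest solution path.

    start, goal: patterns flattened row by row, 0 for the empty space.
    Returns None if the goal is not reached within `limit` moves.
    """
    start, goal = tuple(start), tuple(goal)
    if start == goal:
        return 0
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if depth == limit:
            continue
        for nxt in neighbors(state):
            if nxt in seen:
                continue
            if nxt == goal:
                return depth + 1
            seen.add(nxt)
            frontier.append((nxt, depth + 1))
    return None


def special_difficulty_index(observed_moves, minimum):
    """Sample mean number of moves divided by the minimum number of moves."""
    return mean(observed_moves) / minimum


start = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0, 11, 12, 13, 14, 15]
goal = [1, 2, 3, 0, 5, 6, 7, 4, 9, 10, 11, 8, 12, 13, 14, 15]
shortest = min_moves(start, goal)
```

The depth limit keeps the search small; problems with long solution paths would need a more economical method than plain breadth-first search.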

If reliable performance indices can be obtained, the process of
validating the meaning of the scores will be necessary. Do scores
reflect individual differences in spatial reasoning and problem-solving
ability or in personality variables like perseverance and impulsivity?
It might also be of interest to determine what information-processing
abilities underlie performance on these problems. For example, using
Carroll's (1974) provisional coding scheme for cognitive tasks
appearing in psychometric tests, the following cognitive operations
might be expected to underlie performance: (1) mental rotation of
spatial configurations in visual short-term memory, Factors S and Vz;
(2) performing serial operations in visual short-term memory, Factors S
and Vz; and (3) storage in and retrieval from short-term memory,
Factor Ms.

The results reported here suggest that reasonable indices of
problem difficulty are obtainable given an appropriate norming sample. If
reliable and valid ability scores can be obtained in future studies
with this item type, this type of test would seem especially
appropriate for adaptive administration, since (1) scores on problems
tailored to the individual's ability are more apt to be more highly
related to each other, resulting in total scores with higher
reliability; (2) adaptive administration will likely improve the
motivational aspects of the tests, which seem more taxing and
potentially frustrating than conventional item formats; and (3) equally
precise measurements for most testees can be obtained in shorter
periods of time than with conventional test administration. Thus, the
data suggest that future development of adaptive problem-solving tests
of the type studied here might result in new types of ability tests
that should provide ability scores to supplement those available from
the paper-and-pencil administration of typical ability measures.
Illegal Moves:

18 IS NOT A NUMBER IN THE PATTERN. REMEMBER, PUSH THE "SPACE
BAR" FIRST IF THE NUMBER TO BE ENTERED CONTAINS ONLY ONE
DIGIT.

10P IS NOT A CORRECT MOVE. THE LAST LETTER TYPED MUST BE AN
L, R, U, OR D.

10 CAN NOT BE MOVED LEFT (RIGHT, UP, DOWN) FROM ITS PRESENT
POSITION.



Maximum Move Limit Reached:

YOU HAVE REACHED THE MAXIMUM NUMBER OF MOVES ALLOWED FOR THIS
PROBLEM. PLEASE CONTACT THE PROCTOR.



Computer Data File Error:

THE COMPUTER IS HAVING PROBLEMS. PLEASE NOTIFY THE PROCTOR.
(ERROR 06 HAS OCCURRED. IERR IS -5).



Maximum Time Limit Reached:

IT MIGHT BE A GOOD IDEA TO GO ON TO THE NEXT PROBLEM. PLEASE
CONTACT THE PROCTOR.



Appendix B: Instructions



Screen 1

THE COMPUTER WILL PRESENT YOU WITH A SERIES OF PUZZLES TO WORK ON,
BUT FIRST SOME INSTRUCTIONS WILL BE GIVEN TO BE SURE YOU UNDERSTAND
HOW TO USE THE TYPEWRITER KEYBOARD TO ENTER YOUR RESPONSES.
FOLLOWING THE INSTRUCTIONS YOU WILL BE GIVEN A PRACTICE PROBLEM TO
CLEAR UP ANY PROBLEMS YOU MAY BE HAVING. IN ADDITION, IF YOU HAVE
QUESTIONS AT ANY TIME ABOUT THE INSTRUCTIONS OR ANYTHING ELSE PLEASE
FEEL FREE TO CONTACT THE TEST PROCTOR.

YOU MUST REMEMBER TWO THINGS IN ORDER TO TALK TO THE COMPUTER:

1. ONLY TYPE SOMETHING WHEN A MESSAGE ON THE SCREEN
   IN FRONT OF YOU TELLS YOU TO DO SO AND A QUESTION
   MARK "?" APPEARS.
2. EACH TIME YOU TYPE A RESPONSE ON THE KEYBOARD
   THE COMPUTER DOES NOT RECEIVE IT UNTIL YOU PRESS
   THE "RETURN" KEY.

NOW THE FIRST THING YOU MUST DO IS FIND THE RETURN
KEY. THIS KEY IS THE LARGE RECTANGULAR KEY ON THE
RIGHT END OF THE KEYBOARD. PRESS THE SPACE BAR AT THE BOTTOM OF THE
KEYBOARD AND THE "RETURN" KEY TO CONTINUE THE INSTRUCTIONS.

Screen 2

 1  2  3  4         1  2  3
 5  6  7  8         5  6  7  4
 9 10    11         9 10 11  8
12 13 14 15        12 13 14 15

IN EACH OF THE PUZZLES OF THE TYPE SHOWN HERE YOUR TASK IS TO
TYPE IN A SEQUENCE OF "MOVES" TO CHANGE THE PATTERN OF NUMBERS
ON THE LEFT UNTIL IT MATCHES THE PATTERN ON THE RIGHT. A MOVE
CONSISTS OF 3 TYPED CHARACTERS FOLLOWED BY THE "RETURN" KEY. THE
FIRST 2 CHARACTERS TELL THE COMPUTER WHICH NUMBER IN THE
PATTERN ON THE LEFT YOU WANT TO MOVE. THE THIRD CHARACTER
(WHICH YOU WILL BE TOLD ABOUT SHORTLY) TELLS THE COMPUTER WHAT
DIRECTION YOU WANT TO MOVE THE NUMBER.

IF THE NUMBER YOU WISH TO MOVE HAS 2 DIGITS YOU SHOULD TYPE
THE 2 DIGITS ON THE KEYBOARD. IF THE NUMBER YOU WISH TO MOVE
HAS ONLY 1 DIGIT YOU SHOULD TYPE THE SPACE BAR ONCE AND THEN THE
DESIRED DIGIT. THUS, THE TWO DIGIT NUMBERS 10 TO 15 CAN BE TYPED
IN DIRECTLY, WHILE THE "SPACE" BAR MUST BE TYPED FIRST WITH
THE NUMBERS 1 TO 9.

PRESS THE "SPACE" BAR AND RETURN TO CONTINUE.



Screen 3

 1  2  3  4         1  2  3
 5  6  7  8         5  6  7  4
 9 10    11         9 10 11  8
12 13 14 15        12 13 14 15

REMEMBER THAT THE THIRD CHARACTER IN YOUR "MOVE" TELLS THE
COMPUTER WHAT DIRECTION TO MOVE THE NUMBER IN THE LEFT PATTERN.
NUMBERS CAN ONLY BE MOVED INTO THE SPACE IN THE SQUARE PATTERN
WHICH IS NOT OCCUPIED BY A NUMBER. YOU TELL THE COMPUTER WHICH
DIRECTION TO MOVE THE NUMBER BY TYPING ONE OF THE FOLLOWING 4 LETTERS:

L - IF YOU WANT TO MOVE A NUMBER TO THE LEFT ONE SPACE
R - IF YOU WANT TO MOVE A NUMBER TO THE RIGHT ONE SPACE
U - IF YOU WANT TO MOVE A NUMBER UP ONE SPACE
D - IF YOU WANT TO MOVE A NUMBER DOWN ONE SPACE

THUS, IN THE PATTERN SHOWN HERE THE FOLLOWING 4 MOVES ARE
POSSIBLE: 10R, 11L, 14U, OR <SPACE BAR>7D. ANY OTHER MOVE
WOULD BE ILLEGAL AND RESULT IN A REMINDER MESSAGE BEING
PRINTED BY THE COMPUTER. FOR EXAMPLE, YOU COULD NOT TRY TO MOVE
THE "11" SQUARE TO THE RIGHT ONE SPACE, SINCE ALL MOVES MUST
STAY WITHIN THE SQUARE PATTERN.

PRESS THE "SPACE" AND "RETURN" TO CONTINUE INSTRUCTIONS.
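
The move format and legality rules described on these screens can be sketched in code. This is a hypothetical re-implementation for illustration only, not the original test software: the function names, the use of 0 for the empty space, and the wording of the error messages are assumptions.

```python
DIRECTIONS = {"L": (0, -1), "R": (0, 1), "U": (-1, 0), "D": (1, 0)}


def parse_move(text):
    """Split a 3-character move such as '10R' or ' 7D' into (tile, letter)."""
    if len(text) != 3:
        raise ValueError("a move is exactly 3 characters")
    number, letter = text[:2], text[2].upper()
    if letter not in DIRECTIONS:
        raise ValueError("the last letter typed must be an L, R, U, or D")
    # Two digits, or a space followed by one digit.
    if not (number.isdigit() or (number[0] == " " and number[1].isdigit())):
        raise ValueError("the first 2 characters must name a number")
    return int(number), letter


def apply_move(grid, text):
    """Return the new pattern after a legal move; raise ValueError otherwise."""
    tile, letter = parse_move(text)
    where = {t: (r, c) for r, row in enumerate(grid) for c, t in enumerate(row)}
    if tile == 0 or tile not in where:
        raise ValueError(f"{tile} is not a number in the pattern")
    r, c = where[tile]
    dr, dc = DIRECTIONS[letter]
    nr, nc = r + dr, c + dc
    inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
    if not inside or grid[nr][nc] != 0:
        raise ValueError(f"{tile} can not be moved {letter} from its present position")
    new = [list(row) for row in grid]
    new[nr][nc], new[r][c] = tile, 0
    return new


def legal_moves(grid):
    """List every move that slides a tile into the empty space."""
    moves = []
    for r, row in enumerate(grid):
        for c, tile in enumerate(row):
            if tile == 0:
                continue
            for letter, (dr, dc) in DIRECTIONS.items():
                nr, nc = r + dr, c + dc
                if 0 <= nr < len(grid) and 0 <= nc < len(row) and grid[nr][nc] == 0:
                    moves.append(f"{tile:>2}{letter}")
    return moves


left = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 0, 11], [12, 13, 14, 15]]
right = [[1, 2, 3, 0], [5, 6, 7, 4], [9, 10, 11, 8], [12, 13, 14, 15]]
possible = legal_moves(left)

pattern = left
for move in ("11L", " 8D", " 4D"):
    pattern = apply_move(pattern, move)
```

For the pattern shown on the screen, `possible` contains the same four moves the instructions list, and the three moves in the loop turn the left pattern into the right one.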



Screen 4

IF YOU MAKE A "LEGAL" MOVE THE COMPUTER WILL AUTOMATICALLY
[illegible] UPDATE THE PATTERN ON THE LEFT WHEN
YOU ARE TYPING YOUR MOVES. IF YOUR "MOVE" IS NOT LEGAL A
MESSAGE WILL BE PRINTED UNDER YOUR MOVE AND YOU SHOULD TRY
AGAIN WHEN THE COMPUTER TELLS YOU TO DO SO.

IF YOU ARE HAVING DIFFICULTY UNDERSTANDING THE INSTRUCTIONS
SO FAR PLEASE CALL THE PROCTOR. OTHERWISE PRESS THE "SPACE"
BAR AND "RETURN" TO CONTINUE THE INSTRUCTIONS.

Screen 5

SUPPOSE YOU MAKE A MISTAKE TYPING SOMETHING INTO THE
COMPUTER. YOU CAN CORRECT A MISTYPED CHARACTER AT ANY
TIME BEFORE YOU PRESS THE "RETURN" KEY. BY PRESSING THE
"BACKSPACE" KEY, WHICH IS LOCATED IN THE TOP RIGHT CORNER
OF THE KEYBOARD, YOU WILL "ERASE" THE LAST CHARACTER YOU
TYPED. TO "ERASE" THE LAST TWO CHARACTERS YOU TYPED PRESS
THE "BACKSPACE" KEY TWICE AND SO ON.

AFTER PRESSING "BACKSPACE" THE CORRECT CHARACTER CAN THEN
BE TYPED IN. REMEMBER TO PRESS THE "RETURN" KEY TO SEND
THE CORRECTED CHARACTERS TO THE COMPUTER.

TO SEE HOW THE "BACKSPACE" WORKS TRY TYPING THE MOVE '14D'
ON THE KEYBOARD. THEN CHANGE THE 'D' TO A 'U' BY PUSHING THE
"BACKSPACE" KEY ONCE AND THEN THE CORRECT KEY.
FINALLY, PRESS THE "RETURN" KEY TO SEND YOUR CORRECTED MOVE TO
THE COMPUTER.
Screen 6

YOU ARE NOW ALMOST READY TO BEGIN WORKING. FIRST, HOWEVER, WE NEED
SOME INFORMATION ABOUT YOU.

THE RESULTS OF THE PROBLEMS YOU ARE WORKING ON WILL BE
STRICTLY CONFIDENTIAL. WE ARE INTERESTED IN YOU AS PART OF
A LARGER GROUP, AND AT NO TIME WILL YOUR SCORES BE
CONNECTED WITH YOUR NAME.

BUT WE NEED IDENTIFICATION SO THAT WE CAN KEEP YOUR ANSWERS
SEPARATE FROM OTHER PEOPLE'S AND SO THAT WE CAN COMPARE THE
RESULTS OF THESE SCORES WITH ANY OTHER DATA CONTRIBUTED BY
YOU AT AN EARLIER OR LATER TIME.

PLEASE TYPE YOUR FIRST NAME (JUST YOUR FIRST NAME THIS TIME),
AND THEN "RETURN".

Screen 7

PLEASE TYPE YOUR MIDDLE INITIAL (ONE LETTER ONLY).
IF YOU DO NOT HAVE A MIDDLE NAME, TYPE A "[illegible]".
DON'T FORGET TO PRESS "RETURN".

Screen 8

PLEASE TYPE YOUR LAST NAME AND PRESS RETURN



Screen 9

PLEASE TYPE YOUR SIX OR SEVEN DIGIT STUDENT IDENTIFICATION NUMBER
AND "RETURN".

IF YOU DO NOT REMEMBER YOUR IDENTIFICATION NUMBER AND DO NOT
HAVE IT WITH YOU CALL THE PROCTOR FOR A SUBSTITUTE IDENTIFICATION
NUMBER.



Screen 10

NOW WE WOULD LIKE TO KNOW A FEW THINGS ABOUT YOU. IF
THE QUESTION DOES NOT APPLY TO YOU OR YOU DON'T WANT TO
RESPOND, TYPE IN A QUESTION MARK AND "RETURN".
PLEASE TYPE YOUR AGE AND PRESS "RETURN".



Screen 11

WHICH SEX ARE YOU?
1. FEMALE
2. MALE

TYPE THE CORRECT NUMBER AND PRESS "RETURN"
Screen 12

PLEASE TYPE THE NUMBER CORRESPONDING TO YOUR YEAR IN SCHOOL:

1. FRESHMAN
2. SOPHOMORE
3. JUNIOR
4. SENIOR
5. GRADUATE STUDENT
6. OTHER

DON'T FORGET TO PRESS "RETURN".

Screen 13

LISTED BELOW ARE SEVERAL OF THE COLLEGES WITHIN THE UNIVERSITY.

 1. [illegible]
 2. COLLEGE OF AGRICULTURE
 3. COLLEGE OF BIOLOGICAL SCIENCES
 4. COLLEGE OF BUSINESS ADMINISTRATION
 5. COLLEGE OF EDUCATION
 6. GENERAL COLLEGE
 7. COLLEGE OF HOME ECONOMICS
 8. INSTITUTE OF TECHNOLOGY
 9. SCHOOL OF FORESTRY
10. UNIVERSITY COLLEGE
11. COLLEGE OF VETERINARY MEDICINE
12. GRADUATE SCHOOL
13. LAW SCHOOL
14. OTHER

PRESS THE NUMBER OF THE SCHOOL IN WHICH YOU ARE ENROLLED AND
THE "RETURN" KEY.

Screen 14

[illegible]

1. AFRO-AMERICAN (BLACK)
2. MEXICAN-AMERICAN
3. PUERTO-RICAN
4. OTHER LATIN AMERICAN
5. ORIENTAL OR ASIAN-AMERICAN
6. NATIVE-AMERICAN (INDIAN)
7. WHITE
8. OTHER

TYPE THE NUMBER THAT GIVES YOUR RACE, AND PRESS "RETURN".
Screen 15

IN WHICH CATEGORY IS YOUR CUMULATIVE GRADE-POINT AVERAGE (GPA)?

1) 3.76 TO 4.00
2) 3.51 TO 3.75
3) 3.26 TO 3.50
4) 3.01 TO 3.25
5) 2.76 TO 3.00
6) 2.51 TO 2.75
7) 2.26 TO 2.50
8) 2.01 TO 2.25
9) 2.00 OR LESS

TYPE THE CATEGORY NUMBER ("1" THROUGH "9") AND PRESS "RETURN".



Screen 16

YOU ARE NOW READY TO TRY A PRACTICE PROBLEM.

IN THE PRACTICE PROBLEM AND THE ACTUAL PROBLEMS TO FOLLOW
AN IMPORTANT GOAL IN TRYING TO MAKE THE PATTERN ON THE LEFT
MATCH THE PATTERN ON THE RIGHT IS TO DO SO WITH AS
FEW MOVES AS POSSIBLE. YOUR PERFORMANCE WILL BE DETERMINED
NOT ONLY BY WHETHER YOU ARE ABLE TO MATCH THE TWO PATTERNS
BUT ALSO BY HOW FEW MOVES IT TAKES YOU TO DO SO.
THERE IS NO TIME LIMIT ON ANY OF THE PUZZLES BUT TRY TO
USE YOUR TIME WISELY WHILE STILL TRYING TO USE AS FEW MOVES
AS POSSIBLE. TRY TO COMPLETE EACH PROBLEM. IF, HOWEVER, YOU
HAVE WORKED A LONG TIME ON A SINGLE PROBLEM AND FEEL YOU CAN NOT
SOLVE IT CONTACT THE PROCTOR WHO WILL GET THE COMPUTER TO
PRESENT THE NEXT PROBLEM.

A SUMMARY OF HOW TO TYPE IN YOUR THREE CHARACTER MOVE
AS DESCRIBED EARLIER WILL BE PRESENTED WITH EACH PUZZLE
AS A REMINDER.

IF YOU HAVE ANY QUESTIONS ABOUT WHAT YOU ARE SUPPOSED
TO DO CALL THE PROCTOR. OTHERWISE PRESS THE SPACE
BAR AND "RETURN" KEY TO BEGIN YOUR PRACTICE PROBLEM.
Start and Goal Configurations Constituting the Problem-Solving Test,
and Values of Various Physical Problem Characteristics

Columns: Pattern (Start, Goal); Physical Problem Characteristics
(Solution Path Length; No. of Squares Not Matching; No. of Rows Not
Matching; No. of Columns Not Matching; Euclidean Distance; City-Block
Distance)
Appendix C (Continued)

Start and Goal Configurations Constituting the Problem-Solving Test,
and Values of Various Physical Problem Characteristics
Problem 8



14 


10 


13 


12 


15 


14 


13 


12 






15 


9 


8 


5 


11 


10 


9 


8 


12 


11 


11 




7 


1 


7 




6 


5 






'4 


3 


6 


2 


4 


3 


r 


1 










• 

















Problem 9



i2«00 



12 



1 


2 


3 


4 


1 


'2 


3 


4 


5 


6 


7 


3 


12 


6 




8 


9 




10 


11 


10 


5 


7 


11 


12 


15 


14 


15 


13 


9 


14 


15 



12 



10.24 



12 



Problem 10

Problem 12
Thank you for your participation. In this study, you will be asked to
sort certain puzzles into piles based on how difficult they appear to you.
Although you will not actually solve the puzzles yourself, you will need to
know how they would be solved so that you can estimate how difficult they
would be. All puzzles will be of the type pictured here.



[Figure 1: left pattern, "Make your moves in this pattern"; right
pattern, "Try to match this pattern"]

The way to solve these puzzles is to "move" the numbers in the left pattern
so that the left pattern will match the pattern on the right. A number
may only be moved into the blank square in the left pattern. For example,
to solve this particular puzzle (Fig. 1) one must make 3 "moves" as follows:

Move 1

First, by moving the "9" up one square in the left pattern, we obtain
the following new pattern:
[Figure 2: pattern obtained after Move 1]



Move 2

By moving the "13" up one square in this new pattern (Fig. 2) we obtain
the following pattern:
[Figure 3: pattern obtained after Move 2]



Move 3

Finally, by moving the "12" right, we obtain the following
pattern, which solves the puzzle since it matches the original right-hand
pattern in Fig. 1.
[Figure 4: pattern obtained after Move 3]

If at any point you do not understand how these puzzles are solved,
please contact the proctor before reading on.

You will be presented with a number of these puzzles of varying difficulty.
Your task is to study each puzzle and, keeping in mind how such puzzles are
solved, estimate how difficult each puzzle would be. You should do this using
the following steps. You should complete each step before going on to the
next step. If you have any questions don't hesitate to contact the proctor.

Step 1 • Sort of Puzzles

First, study each puzzle and place it in one of the six piles provided
by the proctor, labelled:

Very Difficult, Difficult, Somewhat Difficult, Somewhat Easy, Easy,
Very Easy.

There is no requirement that each pile contain a certain number of
puzzles. You may feel, for example, that none of the puzzles fits the
description "somewhat easy". Just place each puzzle in the pile that you
feel provides the best description of how difficult it would be to solve
the puzzle. You should try to make your initial placement as accurate as
possible but you are free to change the location of any puzzle you wish if
you change your mind about its difficulty. Remember that you do not have to
actually solve the puzzles. Just study each puzzle long enough to feel
reasonably confident about which pile to place it into.

A few of the puzzles contain a puzzle number and the message
"[illegible] your reason(s)" on the top. For these puzzles, you should
write down the puzzle number shown, the pile in which you placed it, and
the reason(s) for why you are sorting the puzzle into that pile. Use the
space provided just below for this purpose.

For example, if you feel the puzzle would be "very easy" to solve then
place the card in the "very easy" pile and explain why you think it would be
"very easy" to solve next to the puzzle number on the Data Sheet. Do not
just state as a reason "Because it is solved very easily" or
"very quickly". Explain how you decided it would be very easy, that is,
on what basis did you decide to sort it into the "very easy" pile.

Puzzle Number     Assigned Pile     Reason(s) for sorting into the Pile You Did



Step 2 • Record sorting results

Each puzzle card has a number on the back. When you have finished
sorting the puzzles into the 6 piles, list these numbers under the
appropriate labels below. There is no required number of puzzles for any
category.

Very Difficult   Difficult   Somewhat Difficult   Somewhat Easy   Easy   Very Easy

Step 3 • Subdividing the 6 piles

Examine the puzzles you have sorted into each pile in Step 2. You may
feel that not all puzzles in a given pile seem equally difficult to you even
though they can all be described as "very difficult" or "somewhat easy", for
example. If you feel this is the case, subdivide the puzzles within each of
the original piles into as many smaller subpiles representing different
degrees of difficulty as you can. Only create more subpiles if you feel
you can distinguish differences in difficulty between the puzzles in a given
pile. If you cannot differentiate the difficulty of the puzzles within a
given pile then do not subdivide the pile any further. Continue subdividing
the piles until you can no longer differentiate the difficulty of the puzzles
in each pile. During this step you should only compare and subdivide
puzzles within each of the original six piles separately. Do not switch
puzzles from one of the original 6 piles to another one, for example, from
"Easy" to "Very Easy".

If when you have completed this step you have been able to subdivide
any of the original 6 categories, list the card numbers in each pile in the
space provided below. When you list the subpiles always put the hardest
puzzles within a category in subpile 1, the second hardest puzzles in
subpile 2, and so on.

Very Difficult
   subpiles  1  2  ...

Difficult
   subpiles  1  2  ...

Somewhat Difficult
   subpiles  1  2  ...

Somewhat Easy
   subpiles  1  2  ...

Easy
   subpiles  1  2  ...

Very Easy
   subpiles  1  2  ...



Step 4

Please answer the following questions as completely as possible.

1. Your name:

2. Your student identification number:

3. Before today, how often had you tried to solve the kind of puzzle you
   were asked to estimate the difficulty of in this study?

   a. never
   b. a few times
   c. many times

4. How much difficulty did you have understanding what you were supposed
   to do in this study?

   a. no difficulty
   b. a little difficulty
   c. much difficulty

5. When you sorted the puzzles into the original 6 categories, did you use
   any "rules" or criteria for sorting something into "very difficult",
   "difficult", "somewhat difficult", "somewhat easy", "easy", and
   "very easy"? If so, what were they?

   Very difficult -
   Difficult -
   Somewhat difficult -
   Somewhat easy -
   Easy -
   Very easy -
6. If you were able to subdivide the original 6 piles into more piles in
   Step 3, on what basis did you do so? That is, how did you decide which
   puzzles within a pile were more difficult than others?

7. If you did not subdivide any of the original 6 piles, try to explain
   why you could not do so.

8. How often did you use each of the following considerations in
   deciding how difficult a puzzle would be? (Response options for each
   consideration: All, Most, Some, None of the puzzles)

   a. The number of "moves" required to solve the puzzle
   b. The number of "numbers" which did not match in the two patterns
   c. Whether in one of the patterns the numbers were in numeric
      order from 1 to 15
   d. How far apart certain numbers were in the two puzzles
   e. The number of rows in the two patterns that did not match
   f. The location of the "empty space" in the left pattern
   g. The number of columns in the two puzzles that did not match
   h. Whether you could "see" the actual sequence of moves that
      would be needed to solve the problem
   i. The amount of time it would take to solve the problem

9. Did the length of this study affect your ability to do the tasks
   required?

   a. not at all
   b. somewhat
   c. quite a bit

10. How did you feel about working on this study?

   a. I disliked it a lot
   b. I disliked it somewhat
   c. I felt neutral about it
   d. I enjoyed it somewhat
   e. I enjoyed it a lot

11. Any further comments?
Appendix E

Dimensions of Perceived Difficulty

a. The number of moves required to solve the puzzle or an explication of the
   actual moves needed:
      "It only took a few moves"
      "The '12' and '13' will go around corner into places; others look
      like they will move easily"

b. Whether subject could "see" the actual sequence of moves that would be
   needed to solve the problem (no number or explication of the moves):
      "I can work this out just at a glance - its obvious"
      "I see logical moves"

c. The number of squares ("numbers") which did not match in the two patterns:
      "[illegible] location except for '3' in bottom right hand corner"
      "I only had to deal with 5/16 of the digits"

d. The amount of time it would take to solve the problem:
      "Took 10 seconds to solve"
      "Took a while to see the pattern"

e. The type of moves required to solve the puzzle:
      "Some complicated moves must be made"
      "Tricky or misleading moves"
      "Needed a combination of movements of sets of numbers including
      moving number that was in correct spot to allow for other
      movements, then replacing at end"

f. How far apart certain numbers were in the two puzzles:
      "Don't move numbers very far"
      "Numbers in some cases move a great distance"

g. How much thought was required to solve the problem:
      "Required lots of thought"
      "I had trouble keeping all the moves in my head"

h. The number of columns not matching in the two patterns:
      "Because you only have to deal with two of the four columns"

i. The number of rows not matching in the two patterns:
      "Two rows match already"

j. The location of the space in the left pattern:
      "Will require using the right columns because it contains the
      open space"

k. Similarity to an already solved or rated puzzle:
      "This puzzle easier since it resembles one already solved"

l. Whether either the left or the right pattern was in numeric order from
   1 to 15:
      there were no examples of this dimension in the voluntary protocols
Table F-1

[illegible] of Repeated Moves for 13 Problems

Prcbleni 





1 




3 


s 


33 






r 


-.03 






a 


33 






r 


-.03 


i.bb 




N 


54 


33 


33 


r 


-.05 


-.07 


-.07 


N 


5i 


32 


32 


r 


-.04 


-.04 


-.04 


N 


54 


32 - 


. 32 


r 


-.04 


-.06 


-.06 


N 


54 


32 


32 


r 


-.03 




a 




50 


30 


30 


r 


=.04 


-.05 


-.05 


S 


46 


28 


28 


r 


-.07 


-^.08 


-.08 




44 


28 


28 


r 


.08 


.12 


.12 


S 




28 




-T.08 " 


-.08 




48 


28 


28 


r 


-.11 


-.12 


-.12 




49 


29 


2S 


r 


-.03 


.07 


.07 



8 



12 



2 
3 
^4 
5 
6 
7 
8 
9 
10 
11 
12 
13 



.23 



53 


54 














-.09 


.00 














53 


53 


53 












-.05 


-.05 


-.05 












49 


49 


45 


50 










.69 


-.Q5 


-.08 


=.06 










45 


45 


45 


4? 


48 








.13 


.03 


.04 


-.10 


.52 








43 


43 


43 


44 


; 44 


44 






.27 


.40 


.09 


.05 


-il8 


=.14 






48 


48 


48 


45 


48 


45 


43 




.21 


-.07 


-.09 


-.08 


-.60 


-.04 


-.08 




47 


47 ' ' 


47 


48 


47 


44 


42 


47 


.22 


-.04 


.45 


.08 


-.03 


';b3 


.22 


i39 


4^ 


45 


43 


49 


48 


45 


44 


48 


.37 


-.1^^ 


.11 


.21 


.34 


- ;33 


-;04 


^00 



47 
.07 



Correlations not computed due to near zero standard deviations.
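
Each cell of these appendix tables pairs a correlation with its own N, and some correlations are left uncomputed. One way such cells could be produced is sketched below. It assumes pairwise deletion of missing cases and an arbitrary threshold for a "near zero" standard deviation; neither the function name nor the threshold comes from the report.

```python
import math


def pairwise_r(x, y, min_sd=1e-6):
    """Pearson r over cases present on both variables (None marks missing).

    Returns (n, r). r is None when fewer than two complete cases exist
    or when either standard deviation is near zero.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return n, None
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if math.sqrt(sxx / n) < min_sd or math.sqrt(syy / n) < min_sd:
        return n, None  # correlation not computed
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    return n, sxy / math.sqrt(sxx * syy)


# Hypothetical counts of repeated moves on two problems; None marks a
# student who did not attempt that problem.
n_cases, r = pairwise_r([1, 2, None, 4], [2, 4, 6, 8])
n_flat, r_flat = pairwise_r([0, 0, 0], [1, 2, 3])
```

Because each pair of problems is based on the students who attempted both, N can differ from cell to cell, as it does in the tables.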
Table F-2

Cross-Correlations of Number of Legal Moves (Vertical) and
Number of Illegal Moves (Horizontal) for 13 Problems



Problem       1      2      3      4      5      6      7      8      9     10     11     12     13

   1   N     5S     33     33     54     54     54     54     50     46     44     45     48     4?
       r    .61   -.14   -.08    .01   -.08   -.09   -.08   -.10   -.10   -.14   -.09   -.14    .01

   2   N     33     33     33     33     32     32     32     30     25     25     25     25     25
       r   -.13   -.09    .17    .04    .01    .20   -.03   -.17   -.11    .54    .35    .01   -.07

   3   N     33     33     33     33     32     32     32     30     25     25     S3     25     25
       r   -.09   -.14    .29    .16    .11   -.10   -.09   -.12    .00    .15    .06   -.65   -.11

   4   N     S4     33     33     54     53     53     53     45     S5     43     45     47     45
       r   -.16    .14    .10    .30    .50    .21    .31    .00    .01    .02    .13    .13    .25

   5   N     54     32     32     53     54     54     53     45     45     43     45     47     45
       r   -.15    .30   -.08    .07    .20    .24    .26    .14    .15    .10   -.06   -.08    .15

   6   N     64     32     32     53     54     54     53     45     45     43     45     47     45
       r   -.14    .16   -.03   -.06   -.08    .14    .17    .38    .30    iOS   -.15    .02    .01

   7   N     54     32     32     53     53     53     54     50     45     44     45     48     45
       r   -.10    .25    .17    .08   -.01    .23    .68    .13   -.11    .13    .08    .03    .06

   8   N     50     30     30     4S     45     45     50     SO     46     44     48     47     45
       r   -.08    .39    .01   -.09    .21   -.12   -.03    .27    .10   -.03   -.06    .06    .is

   9   N     46            25     45     45     45     46     46    4i?     44     45     44     45
       r   -.15   -.05    .00   -.29   -.08    .03    .08   -.04    .30   -.01   -.13    .12   -.21

  10   N     44     25     25     43     43     43     44     44     44     44     43     42     44
       r    .17    .32    .01    .25    .08    .17    .49   -.09   -.06    .06    .03   -.07   -.01

  11   N     4d     23     25     48     45     45     45     45     45     43     4S     47     45
       r    .07    .21    .20   -.11   -.01    .01   -.04    .05   -.06    .32    .33   -.13   -.03

  12   N     45     25     25     47     47     47     45     47     44     42     47     45
       r   -.09   -.08    .32   -.28    .08    .16   -.01   -.13    .06   -.14   -.15    .11   -.33

  13   N     49     25     29     4d     45     45     45     45     45     44     45     47     45
       r    .17    .10    .26   -.14    .11   -.05    .04   -.08    .01    .02   -.04   -.01    .20

Table F-3

Cross-Correlations of Number of Legal Moves (Horizontal) and
Number of Repeated Moves (Vertical) for 13 Problems



Problem

N     54    S3    33     M    S3    53    53          4S    ^3    48    4?    4S
r    .05   .60   .60   .57   .16   .02   .05   .04   .01   .23   .01   .08   .26

N     54    32    32    S3    S3    53    53    4B    35    33    48    4?    4S
r    .06   .03   .08   .16   .32   .08   .09   .08   .11   .04   .15   .21   .22

N     53    32    32    S3    S3    54    S3    39    35    33    48    37    35
r    .04    ,m   .06   .06   .03   .79   .06   .08   .03   .02   .09   .43   .02

N     S3    32    32    53    S3    S3    54    50    45    44    3^    48    35
r    .21  -.11  -.11   .22   .20   .13  -.03   .12   .00  -.12   .07  -.02   .14

N    Si?    30    30    45    35    35    50    SO    46    44    4S    47    4a
r   -.07  -.08  -.08   .26   .28   .00  -.10   .44   .10  -.06  -.01  -.08  -.01

N     48    23    29    45    45    45    46    45    46    44    45    44    45
r    .13   .16   .16   .07   .12   .30   .18   .28   .49   .23   .16   .33   .12

N     44    2S    28    43          43    33    44    34    44    43    42    44
r   .?-2   .11   .11   .16   .03   .20   .05   .17   .15   .^7   .07   .24   .09

N     45    28    28    48    48    48    45    48    35    43    45    47    48
r    .02   .10   .IC   .29   .07   .09   .08   .00   .16   .06   .46   .07   .09

N     38    28    28    47    37    37    48    37    34    42    37    33    37
r    .13   .14   .14  -.21   .14   .06  -.23   .12   .20  -.09   .06   .13   .03



Correlation not computed due to near zero standard deviations.

Table F-4

Cross-Correlations of Number of Illegal Moves (Vertical) and
Number of Repeated Moves (Horizontal) for 13 Problems

33 £i 54 54 53 i8
-.03
5^ 32 32 S3 Si S3 4^ 45 i3 47
6
32 SO id
-.09 .17
50 3d 30 #S 49 49 SO 50 46 ii 48 i7 48
8
-.12 -.12 .28
46 23 45 i5 tS 46 iS 44 ia
N  44 28 28 ^3 i3 ii 28 28 48 48 48 i^ 4B 43 iP 47 i8
11
r  -.09 .06 .36 .02 -.12 -.09
id 2d 28 3? iP 4? 48 4? ii i2 4? i8 i?
-.14 .47 .14 .01 .31 .43
13
49 25 29 id 49 id id 48 45 ii i8 ir 49
r  .01 -.It -.11 .49 .34 -.12 .01 -.09 .22 -.04 .09

Correlations not computed due to near zero standard deviations.

Table F-5

Product-Moment Correlations Between Initial Move Latency

Performance                                     Problem
Measure            1     2     3     4     5     6     7     8     9    10    11    12    13

Individual
  Score          .03   .31  -.08  -.01  -.01  -.11  -.02  -.16  -.23  -.19  -.07  -.03  -.26
Total 2         -.17  -.01  -.22  -.15  -.10  -.23  -.03  -.14  -.07  -.18  -.09  -.15  -.16
HRATE            .16 -.Oft   .21   .09  -.03   .15   .01   .01   .08   .13   .07   .07   fl8



Table F-6

Product-Moment Correlations Between Average Move Latency
for Each Problem and Performance Measures by Problem

Performance                                     Problem
Measure            1     2     3     4     5     6     7     8     9    10    11    12    13

Individual
  Problem
  Score         -.06   .00  -.03   .07   .07   .00  -.01  -.07  -.16   .00   .16   .00   .07
Total 2         -.11  -.14  -.14  -.05  -.04  -.15  -.10  -.01  -.20  -.15   .10  -.22  -.29
HRATE            .09   ^04   .08   .02  -.09   .07  -.01  -oOS   .06   .04  -.12   .06   .13